I'm planning to try using the Google Prediction API (https://cloud.google.com/prediction/) to detect spam. Before it can do that, it needs to be trained on examples of what spam and non-spam posts look like.
Would it be possible for the mods or admins to give me a file with all of the posts that were deleted for being spam, along with the topic each one was posted in?
If that is not possible, could the mods, as they delete spam posts, paste them into the sheet labeled SPAM in this spreadsheet:
https://docs.google.com/spreadsheets/d/16frPDZkHcg-WYuWtj_Qqkc0fzPtoj4kBKrjpCrlU9h4/edit?usp=sharing
Users can also help. If you see a post that you think is spam, you can add it to the SPAM sheet in the above spreadsheet. You can also add posts that you think are good (not spam) to the sheet labeled NOT SPAM. If you add posts to the spreadsheet, please leave out any text quoted from other posts.
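For anyone curious how the spreadsheet entries could be turned into training data, here is a rough Python sketch. It rests on assumptions: that the training file is a CSV with the label in the first column and the post text in the second (my understanding of what the Prediction API expects, not something I have confirmed), and that quotes appear either as BBCode [quote]...[/quote] blocks or as lines starting with ">". The names strip_quotes and to_training_csv are made up for this example.

```python
import csv
import io
import re

# Assumption: quoted text shows up as BBCode [quote]...[/quote] blocks
# or as lines that begin with ">". Adjust for the forum's real markup.
QUOTE_BLOCK = re.compile(r"\[quote[^\]]*\].*?\[/quote\]", re.IGNORECASE | re.DOTALL)


def strip_quotes(post):
    """Remove quoted text from a post and collapse whitespace."""
    without_blocks = QUOTE_BLOCK.sub("", post)
    kept = [
        line
        for line in without_blocks.splitlines()
        if not line.lstrip().startswith(">")
    ]
    return " ".join(" ".join(kept).split())


def to_training_csv(spam_posts, good_posts):
    """Build CSV text with the label first and the cleaned post second.

    Hypothetical layout: one row per post, labels "spam" / "not_spam".
    Posts that are empty after quote removal are skipped.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for label, posts in (("spam", spam_posts), ("not_spam", good_posts)):
        for post in posts:
            text = strip_quotes(post)
            if text:
                writer.writerow([label, text])
    return buffer.getvalue()


if __name__ == "__main__":
    spam = ["Cheap pills!!! [quote]hi[/quote]"]
    good = ["> old text\nThanks, that fixed it."]
    print(to_training_csv(spam, good), end="")
```

The posts from the SPAM sheet would go in as spam_posts and the ones from the NOT SPAM sheet as good_posts. Stripping the quotes in code is only a fallback; entries are still cleaner if the quoted text is left out when they are added.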
I know that this sounds like a lot of work for users and mods, but I think that having a prediction model for spam would benefit this forum.
If anyone wants to help with this, please feel free. If you have suggestions for any part of this project, please let me know.