Wednesday, 8 February 2012

Submitted what I have so far for my super to have a look at, and I hope I am on the right track, with only about ... so I nervously await feedback...


In the meantime I will draw up my Gantt chart project schedule. I also have ethics and risk forms to complete that have to be attached to my proposal. I am unsure whether I need to go over these with my super or just get a signature, so I will have to check that.


I still have so much more researching to do, and I feel that with more time I could have written a more in-depth proposal, but as the time was constrained I have only been able to skim the surface of it all.


I also have to re-acquaint myself with how XML works, so I need to allocate a good amount of time in my Gantt chart for this.


And I need to learn how to analyse data and become familiar with statistics and methods.


Ah, just remembered something else that I could add to my proposal...

Tuesday, 7 February 2012

I am just adding the finishing touches to my proposal. It is still not at the 5000 word stage, so I am trying to add other relevant information to it, but I will have it finished by tonight and submitted to my super for feedback.


I got my glasses today, so reading the papers is certainly a whole lot easier now: no more screwed-up facial expressions and sore heads! Almost a joy... almost :P

Monday, 6 February 2012

Decided that I have done enough reading to cover the proposal at least. There is much much MUCH more to read and research, but the time has come to finish up my proposal and include all the information I have gathered thus far. This is such a huge field and I could easily spend the next 3 months just reading papers and trying to decide what to do and when, so time to bite the bullet and put my plan on paper! And get some feedback from my super and hope I am on the right track! :/

A little bit of self-doubt is creeping in about whether I can actually pull this off, as my XML skills aren't that great, so I am taking it one day at a time!

Sunday, 5 February 2012

I spent yesterday researching LSI methodology and I really don't think it will suit my purpose: it is very mathematical and doesn't mine the information that I need. I still need to consider suffix stripping, but I will have to find an alternative.
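To remind myself what suffix stripping actually does, here is a rough sketch in Python. It is not the Porter stemmer, just a toy that chops a few common English endings so that related word forms collapse to one stem. The suffix list, the minimum stem length and the names `strip_suffix` and `stem_tweet` are all my own assumptions for illustration.

```python
# Toy suffix stripper: an illustration only, NOT the full Porter algorithm.
# The suffix list and the minimum stem length are assumptions for this sketch.
SUFFIXES = ("ingly", "ment", "ness", "ing", "ed", "ly", "es", "s")  # longest first


def strip_suffix(word, min_stem=3):
    """Remove the first matching suffix if at least min_stem letters remain."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


def stem_tweet(text):
    """Split a tweet on whitespace and strip a suffix from each token."""
    return [strip_suffix(token) for token in text.split()]


if __name__ == "__main__":
    for w in ("voting", "voted", "votes"):
        print(w, "->", strip_suffix(w))
```

With this toy, "voting", "voted" and "votes" all reduce to the same stem, but "vote" itself is left alone, which shows why a proper stemmer needs more careful rules than a plain suffix list.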


I will be relying on quantitative information, statistics and data forecasting.


Today I will spend researching the corpus methods (general corpus / Araucaria corpus) so I can reach a decision and include it in my proposal.


I am only approx. 2000 words into it, but I feel I am not explaining well enough what my intentions are and how I am going to achieve the outcome. I need to pinpoint these exactly.


But I think I will only focus on Twitter at the moment. If I get it nailed, it can be used on other social networking sites.




 

Friday, 3 February 2012

I found these white papers on text mining, with a very interesting breakdown of polysemy and synonymy. I think these might point me in a good direction on how to detect sentiments.


http://www.ijcaonline.org/volume28/number2/pxc3874633.pdf

http://wvoca.com/r/3/1/Read_Attach/3a7ce958b3b6ae1f284e45e5ca6475d5/Porter%20Stemmer.pdf

I also need to read more on Latent Semantic Indexing (LSI), as this is where I might be able to create some sort of algorithm, or at least it might help me to understand more of what I need to do. That is tonight's task!
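A quick sketch of why synonymy and polysemy matter for plain word matching, which is the problem LSI is meant to help with. This is not LSI itself, only a bag-of-words cosine similarity in Python. The example phrases and the function names `bag_of_words` and `cosine_similarity` are made up for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lower-cased, whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Synonymy: similar meaning, no shared words, so the score is zero.
synonym_score = cosine_similarity(bag_of_words("happy result"),
                                  bag_of_words("glad outcome"))

# Polysemy: "bank" means different things, yet the shared word raises the score.
polysemy_score = cosine_similarity(bag_of_words("river bank"),
                                   bag_of_words("bank loan"))

if __name__ == "__main__":
    print("synonymy:", synonym_score)
    print("polysemy:", polysemy_score)
```

The first pair scores zero even though the phrases mean much the same thing, while the second pair scores about 0.5 purely because both contain the word "bank".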

Thursday, 2 February 2012

Had the best email ever! Not 20 pages, 7-8 should do it! #mademynight!
Hmm, literature review... I really should have found out how to do this before now!


The following links have been useful to me so far:


http://www.the-data-mine.com/bin/view/Software/MostPopularDataMiningSoftware

http://www.thetweetographer.com/

http://datamining.typepad.com/data_mining/2010/08/data-journalist-david-mccandless.html

http://datamining.typepad.com/data_mining/2010/01/the-mathematics-of-modern-war.html

Had a great meeting with my lecturer yesterday and had a eureka moment! I have reached a decision on a research question:


Is it possible to predict the likelihood of Scottish Independence by mining micro blogging? 


I have watched a few (lots of) videos about the different types of open source text mining software and I think that I will be using RapidMiner, although I haven't fully tested it out. I will probably dabble with a few others before I decide for sure.


Now that I have my research question I can start on my proposal. 20 pages in a week? My lecturer would like to see the proposal before I submit it.


Head-down time: eat, sleep and live at my desk for the next few days, no time for bathing and the kids will have to look after themselves!


On with the search to fulfil my literature review requirements...