Submitted what I have so far for my super to have a look at, and I hope I am on the right track, so now I nervously await feedback...
In the meantime I will draw up my Gantt chart project schedule. I also have ethics and risk forms to complete and attach to my proposal. I am unsure whether I need to go over these with my super or just get a signature, so I will have to check that.
I still have so much more research to do, and I feel that with more time I could have written a more in-depth proposal, but as time was tight I have only been able to skim the surface of it all.
I also have to re-acquaint myself with how XML works, so I need to allocate a good amount of time in my Gantt chart for this.
I also need to learn how to analyse data and become familiar with statistics and methods.
Ah, just remembered something else that I could add to my proposal...
Wednesday, 8 February 2012
Tuesday, 7 February 2012
I am just adding the finishing touches to my proposal. It is still not at the 5000-word stage, so I am trying to add other relevant information to it, but I will have it finished by tonight and submitted to my super for feedback.
I got my glasses today, so reading the papers is certainly a whole lot easier now: no more screwed-up facial expressions and sore heads! Almost a joy....almost :P
Monday, 6 February 2012
Decided that I have done enough reading to cover the proposal at least. There is much, much, MUCH more to read and research, but the time has come to finish up my proposal and include all the information I have gathered thus far. This is such a huge field, and I could easily spend the next 3 months just reading papers and trying to decide what to do and when, so it is time to bite the bullet and put my plan on paper! And get some feedback from my super and hope I am on the right track! :/
A little bit of self-doubt is creeping in about whether I can actually pull this off, as my XML skills aren't that great, so I am working on a one-day-at-a-time basis!
Sunday, 5 February 2012
I spent yesterday researching LSI methodology and I really don't think it will suit my purpose: it is very mathematical and doesn't mine the information that I need, so I will have to find an alternative, although I still need to consider suffix stripping.
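To make suffix stripping a bit more concrete, here is a minimal Python sketch. It is a toy illustration only, not the Porter algorithm: the suffix list, the `min_stem` cut-off and the function name `strip_suffix` are all assumptions made up for this example.

```python
# Toy suffix stripper: NOT the full Porter algorithm, just the basic idea.
# SUFFIXES is a hypothetical list, ordered so longer suffixes are tried first.
SUFFIXES = ("ingly", "edly", "ing", "ed", "ly", "es", "s")


def strip_suffix(word, min_stem=3):
    """Remove the first matching suffix, keeping at least min_stem letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    # No suffix matched (or the stem would be too short): leave the word alone.
    return word


words = ["Voting", "voted", "votes"]
print([strip_suffix(w) for w in words])  # ['vot', 'vot', 'vot']
```

The three forms collapse to the same stem, so they would be counted as one term when mining tweets. A real stemmer has many more rules; for example, this toy version leaves "vote" untouched.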
I will be relying on quantitative information, statistics and data forecasting.
I will spend today researching corpus methods (general corpus / Araucaria corpus) so I can reach a decision and include it in my proposal.
I am only approx 2000 words into it, but I feel I am not explaining well enough what my intentions are and how I am going to achieve the outcome. I need to pinpoint exactly.
But I think I will only focus on Twitter at the moment. If I get it nailed, it can be used on other social networking sites.
Friday, 3 February 2012
I found these white papers on text mining, with a very interesting breakdown of polysemy and synonymy. I think these might point me in a good direction on how to detect sentiment:
http://www.ijcaonline.org/volume28/number2/pxc3874633.pdf
http://wvoca.com/r/3/1/Read_Attach/3a7ce958b3b6ae1f284e45e5ca6475d5/Porter%20Stemmer.pdf
I also need to read more on Latent Semantic Indexing (LSI), as this is also where I might be able to create some sort of algorithm, or at least it might help me understand more about what I need to do. That is tonight's task!
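The synonymy problem is easy to see with plain word counting. The Python sketch below is only an assumed illustration (the sample phrases and the helper names `bag_of_words` and `cosine` are invented here, and it is not LSI itself): it compares short texts by cosine similarity over raw word counts.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lower-cased, whitespace-separated words in a text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count Counters (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up sample phrases, not real tweets.
a = bag_of_words("support independence")
b = bag_of_words("back separation")           # similar meaning, different words
c = bag_of_words("support independence now")  # shares words with a

print(cosine(a, b))  # 0.0, no words in common
print(cosine(a, c))  # high, because two of the three words are shared
```

Two phrases that mean much the same thing score zero because they share no words, which is the gap that methods such as LSI try to close.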
Thursday, 2 February 2012
Hmm, literature review....I really should have found out how to do this before now!
The following links have been useful to me so far:
http://www.the-data-mine.com/bin/view/Software/MostPopularDataMiningSoftware
http://www.thetweetographer.com/
http://datamining.typepad.com/data_mining/2010/08/data-journalist-david-mccandless.html
http://datamining.typepad.com/data_mining/2010/01/the-mathematics-of-modern-war.html
Had a great meeting with my lecturer yesterday and had a eureka moment! I have reached a decision on a research question:
Is it possible to predict the likelihood of Scottish Independence by mining micro blogging?
I have watched a few (lots of) videos about the different types of open source text mining software and I think that I will be using RapidMiner, although I haven't fully tested it out. I will probably dabble with a few others before I decide for sure.
Now that I have my research question I can start on my proposal. 20 pages in a week? My lecturer would like to see the proposal before I submit.
Head-down time: eat, sleep and live at my desk for the next few days, no time for bathing, and the kids will have to look after themselves!
On with the search to fulfil my literature review requirements....
Saturday, 28 January 2012
So my topic is to be Open Source Intelligence Analysis using R and text mining....
What is the current mood in, say, Iran at this present moment? Intelligence agencies might benefit from a tool that can collect information intelligently to support government or military decision making. The blogosphere, Twitter and many other social networking sites could, for example, provide the required information.
I received a couple of white papers with examples of how terrorists use these methods for communication and information gathering, which I read and found quite eye-opening. The use of smartphones and of status updates limited to a certain number of characters is very significant. I will have to research more into other similar situations. This investigation will be independent, although with guidance from my lecturer if required.
My next stage (today) is to fully understand how R works and the process of data/text mining, as I have had no experience so far. I will be posting any links that I find helpful, and I will see if there is an alternative open source tool to R, as it seems a bit outdated and hard to navigate around.
I have to consider a question for my thesis and an area for investigation. My lecturer suggested the Question Time Twitter feed, as there is a lot of opinion on there and it might be worthwhile trying to mine that. (Once I get to grips with what mining actually does!)
Hoping that when I meet with Les (lecturer) I will feel more confident about which angle to take on this.