Mind Map Generator Software Model with Text Mining Algorithm

Robert Kudelić¹, Mladen Konecki¹, and Mirko Maleković¹
¹ Faculty of Organization and Informatics, University of Zagreb, Pavlinska 2, 42000 Varaždin, Croatia
E-mail(s): [email protected], [email protected], [email protected]

Abstract. A mind map is a diagram that represents ideas, words and items linked to and arranged around a central key word or idea. It is widely used to help with studying, organizing information, solving problems and making decisions. There are many tools that aid in making mind maps, but those tools are just mind map editors. This article describes our model of a mind map generator that generates mind maps from text. We point out the key features that this type of software should have, identify the problems that occur and propose solutions for them. We also describe a text-mining algorithm that we have developed and give examples of its runtime behavior.

Keywords. mind map, generator, text analysis, text mining, model, software

Proceedings of the ITI 2011 33rd Int. Conf. on Information Technology Interfaces, June 27-30, 2011, Cavtat, Croatia

1. Introduction and related work

A mind map is a diagram used to represent words, ideas, tasks, or other items linked to and arranged around a central key word or idea. Mind maps are used to generate, visualize, structure, and classify ideas, and as an aid to studying and organizing information, solving problems, making decisions, and writing. They have numerous applications in many areas, and there are many research articles about mind maps and their applications.

One of the main purposes of mind maps is to aid education by organizing knowledge in a structured way. A mind map can be used as a teaching resource [9], and it can be used in the education of a specific group of students [7]. Research has shown that cognitive structures of knowledge are better when learning with mind maps than when learning in the traditional way [8]. Research on teachers has also shown that mind maps are a good aid in oral lessons [15]. Mind maps are therefore very useful in the field of education: well-structured data is much easier to understand than unstructured data. Mind maps can also be used as a tool to model semi-structured documents [4] and to organize data in a more intuitive way.

There are many other areas where mind maps can help us. Today it is not a problem to get information, since the Internet is a huge information resource, but to get high-quality information we need good search engines. Using mind maps, we can support expert search and document summarization [3], speed up the search process and get more relevant information [2]. Mind maps can also help us filter search results better than a traditional page ranking system [20].

There is also research in the area of cognitive functions. Cognitive mind maps can help in fostering trust [13]. Mind maps have been used in conceptual design to fully develop designers' potential [11] and to help keep the balance between science and art, as well as between logical and imaginative thinking. Mind maps can also be used to better understand and analyze conversation streams [14]. So there are many possible applications of mind maps in this field.

Mind maps can be used in the process of organizing and planning. Some researchers recommend using mind maps in analysis and e-government design [18]. With mind maps it is easier to get a broad overall view, focus on the details, get a better understanding of the implementation, etc. Another article shows how mind maps can help in the health service with qualitative data analysis where time is crucial [5].

There are also some very interesting articles on the use of mind maps, such as an algorithm that helps generate ideas using mind maps [12]. There is also a proposal for a new model of mind maps, called mind maps of the next generation [17]. This model would bring association, back-tracking, comparison and cognitive functionality to mind maps, together with a new way of connecting mind map elements.

There are many mind map tools that can help us make a mind map, such as Compendium, FreeMind, Freeplane, SciPlore MindMapping, Cacoo, Inspiration, MindGenius, MindMapper, Mindomo, NovaMind, Semantica, Visual Mind etc. However, none of these tools can generate mind maps from text. In order to create a mind map, we have to know how to make one, read the text and create the mind map ourselves, and the only help we have are editing tools for drawing mind maps. These tools do not differ much from other diagram editing tools.

The idea of automatically generating mind maps from raw documents is not new [1]. However, integrating automatic mind map generation with the capabilities of standard manual mind map creation software is rather new, and it opens wide research possibilities in the fields of artificial intelligence, text mining, machine learning, et cetera. Mind maps are also used in knowledge management, where it is critical to rapidly discover, interpret, share and reason over the data [19] [16].

In this article, we describe our idea of mind map generator software. We point out the key features and functionality that this kind of software should have, as well as the main problems we have to deal with, and we propose adequate solutions for them. We also describe the problems that occur when text mining with our custom-developed algorithm and give a few examples of its runtime execution.

2. Model description

2.1. Problem description

As we've seen in the introduction, there are many areas where we can use mind maps, but let's look at the problems that are present today. Firstly, there is so much data and information to process that it is nearly impossible to process it all; the Internet is a database with an enormous amount of information. Secondly, we have a limited amount of time to do it. Therefore we have to find a more efficient way to process information, and we need to structure data in a timely manner. Sometimes it's hard to understand and comprehend the problem we're dealing with. Another problem is that it's hard to remember all the data we will need in the future. Fig. 1 shows what percentage of data we remember after an hour, two hours or three hours, based on [10]. How do we transfer knowledge from short-term memory to long-term memory?

Figure 1. How much do we remember

We will describe a model of mind map generator software. This system can help us get relevant information from unstructured and semi-structured data much more quickly. It can also help in the learning process by making it easier and more intuitive. With generated mind maps, the user needs to process less information in a shorter time frame. With a mind map database we can integrate knowledge from various fields, and it will help in searching big documents quickly and efficiently. This system will be able to represent data, information or knowledge in a new way that is easier to comprehend.

2.2. Visual appearance

Figure 2. Example of mind map

Tony Buzan, the author of the book on mind maps recommends what features mind maps should have [6]. We will mention those related to our work:


• Use words, pictures, colors and dimensions throughout your mind map
• Lines should connect a whole mind map; central lines are thicker and they get thinner as they radiate from the centre
• Each word should have its own picture
• Develop your own style of mind mapping
• Keep the mind map clear by using radial hierarchy

Based on these features we can determine what our generated mind map should look like. The map will be generated from its central word. The terms that are directly connected to this central word will be linked to it by radial lines spread evenly around the center. The other levels of the mind map won't follow the same rule. Every new level that connects to the core terms of a mind map must develop in the direction away from the central point, so that a word on the 2nd level is never closer to the central word than a word on the 1st level. The lines will be in different colors so that each category on the 1st level has its own color. Each word will also have its own unique picture; the picture in the center will be the largest, while other pictures will become smaller as they move away from the center. The same rule applies to words and lines, so the ones on the lower levels will be smaller and thinner. With this combination of words, pictures, colors, sizes and thicknesses, we create much stronger cognitive structures of knowledge [6].
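The layout rules above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Node` class, the `ring_gap` spacing and the linear `line_width` falloff are assumptions made for illustration. The sketch places first-level terms at evenly spaced angles and, as one simple way to keep deeper levels further from the center, puts every level on its own ring inside the angular sector of its parent.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """A mind map term; `children` holds the terms on the next level."""
    word: str
    children: list[Node] = field(default_factory=list)
    x: float = 0.0
    y: float = 0.0
    level: int = 0


def layout(root: Node, ring_gap: float = 100.0) -> None:
    """Place `root` at the origin and spread its subtree radially.

    Each node owns an angular sector. Its children split that sector
    evenly and sit one ring further out, so the distance from the
    center always grows with the level.
    """
    def place(node: Node, level: int, start: float, end: float) -> None:
        node.level = level
        angle = (start + end) / 2.0          # middle of the node's sector
        radius = level * ring_gap            # ring 0 is the center itself
        node.x = radius * math.cos(angle)
        node.y = radius * math.sin(angle)
        if not node.children:
            return
        step = (end - start) / len(node.children)
        for i, child in enumerate(node.children):
            place(child, level + 1, start + i * step, start + (i + 1) * step)

    place(root, 0, 0.0, 2.0 * math.pi)


def line_width(level: int, base: float = 6.0, minimum: float = 1.0) -> float:
    """Lines get thinner on lower levels (assumed linear falloff)."""
    return max(minimum, base - 1.5 * level)
```

Calling `layout(root)` fills in `x`, `y` and `level` for every node. Picture and font sizes could be scaled by level in the same way as `line_width`.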

2.3. System architecture and functionality

This software should work on home computers, laptops, PDAs and even cell phones. Home computers and laptops are powerful enough to execute mind map generation algorithms; however, execution on PDAs and mobile phones would be somewhat slow. Therefore we recommend that the system be based on SOA, using a web service. PDAs and mobile phones can send the input data, for example a web page link, to the web service. The web service will then execute the algorithms, generate a mind map and send it back to the user.

There is another reason why we are going to use a web service and a database server. We will store all generated mind maps on our database server so that each user can search and download all the mind maps created. When somebody generates a mind map and there are already mind maps around the same word in our database, we can recommend those mind maps to the user. After a period of time, we will have a big database with lots of mind maps, and we can then use it for mind map integration and further research. Fig. 3 shows how we're planning to implement this system.

Figure 3. System architecture
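The storage and recommendation behaviour described for the database server can be sketched as follows. This is only an illustration under stated assumptions: `MindMapStore` and `StoredMap` are hypothetical names, an in-memory dictionary stands in for the database server, and matching maps by the lower-cased central word is an assumed rule that the paper does not specify.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredMap:
    map_id: int
    central_word: str
    source: str  # e.g. the URL or file name the map was generated from


class MindMapStore:
    """In-memory stand-in for the database server (hypothetical)."""

    def __init__(self) -> None:
        self._by_word: dict[str, list[StoredMap]] = defaultdict(list)
        self._next_id = 1

    @staticmethod
    def _key(word: str) -> str:
        # Assumed matching rule: ignore case and surrounding whitespace.
        return word.strip().lower()

    def recommend(self, central_word: str) -> list[StoredMap]:
        """Return stored maps built around the same central word."""
        return list(self._by_word.get(self._key(central_word), []))

    def save(self, central_word: str, source: str) -> tuple[StoredMap, list[StoredMap]]:
        """Store a new map; also return earlier maps on the same word."""
        earlier = self.recommend(central_word)
        stored = StoredMap(self._next_id, central_word, source)
        self._next_id += 1
        self._by_word[self._key(central_word)].append(stored)
        return stored, earlier
```

In a deployed system the web service would call something like `save` after generating a map and return the `earlier` list to the client as recommendations.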

Considering that pictures are a very important part of mind maps, we will use our server architecture to create a big database of pictures that are suitable for each word. We will make an initial database with pictures for the most common words, so the system will automatically place the pictures in the generated mind maps. For words without a picture in our database, users will be able to upload their own pictures. After a period of time, we will have multiple suitable pictures for each of the words, so users will be able to choose the pictures they like for each word in their mind maps. They will always have the option of changing the pictures of the given keywords, because every person has different associations and it is very likely that not all automatically selected pictures will give appropriate associations for every user. More importantly, every user will probably change at least a few pictures to adapt the generated mind map associations to his or her liking.

Earlier in the text we mentioned that it is good for users to develop their own styles of mind mapping. Therefore we will provide editing features that make it possible for users to edit their mind maps. They will be able to rearrange nodes and change colors, pictures and words, and so fully customize their mind maps.

The most important part of this software will be the algorithms that actually generate mind maps. Those algorithms should, among other things, analyze the given text through relations between words in sentences, relations between sentences and relations between paragraphs, filtering out of irrelevant words, etc. Based on the results of this analysis we can automatically generate a mind map for a given text. For this problem, we can't just look at document headers or metadata. That kind of analysis would be a good starting point, but it would not work for entirely plain texts. The algorithms should give good results for any given text, and we can't predict in which form the text will be. The details of these algorithms will be presented in upcoming papers, since this paper describes only the basic model of the system and the text-mining algorithm.

When a mind map is generated, the user will be able to browse it and to search for any term that was in the original text, and the search results will automatically mark the nodes that are related to the search term. Also, when a user clicks on a particular node, only the text relevant to that node will be shown.

These are the main features that we consider important for a mind map generator to have. If we want to generate a mind map from a text, the first thing we need is an algorithm that will parse this text. In the following section we will give a few examples of this parser at work.
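The browsing features described above (marking the nodes related to a search term, and showing only the text relevant to a clicked node) can be sketched as below. The paper does not define how a node is linked to its source text, so the `passages` field and the case-insensitive substring match are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MapNode:
    word: str
    passages: list[str] = field(default_factory=list)  # source sentences for this node
    children: list[MapNode] = field(default_factory=list)
    marked: bool = False


def walk(node: MapNode):
    """Yield the node and all of its descendants in pre-order."""
    yield node
    for child in node.children:
        yield from walk(child)


def mark_matches(root: MapNode, term: str) -> list[str]:
    """Mark every node related to `term` and return the marked words."""
    needle = term.casefold()
    hits = []
    for node in walk(root):
        texts = [node.word, *node.passages]
        node.marked = any(needle in text.casefold() for text in texts)
        if node.marked:
            hits.append(node.word)
    return hits


def text_for(node: MapNode) -> str:
    """Text shown when the user clicks a node: only its own passages."""
    return " ".join(node.passages)
```

A user interface could highlight every node whose `marked` flag is set after a call to `mark_matches`, and display `text_for(node)` in a side panel when a node is clicked.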

3. Text-mining algorithm

For every text that we want to generate a mind map from, we need an algorithm that will parse the text and extract useful structured data. The algorithm takes a text source (document, web page) as input and gives us the extracted text as output. The extracted text will then be the input for the algorithm that actually generates a mind map. If we have a pdf, doc, rtf or a document of a similar type, text extraction is not a problem, since all we need to do is take the entire content of the document. But if the text source is a web page, the needed text has to be extracted from the web code, and for that we need a somewhat smarter algorithm.

If a web page is coded according to W3C standards, a small algorithm with just a few regular expressions is enough, since it is not difficult to find patterns in the code. There are also pages that do not conform to these standards, and for them we need a more complex, special algorithm that tries as best it can to extract the text. Extracting text from such pages is difficult because they are poorly structured: the tags are badly encapsulated, unclosed, etc. Let's first take a look at the pseudo code of the algorithm that we have developed, and after that we will give some examples of its inputs and outputs at runtime.

Pseudo code:

Input: Text from the following data sources (doc, docx, pdf, txt, rtf, web-page)
Output: extracted text

IF (data source = doc || docx || pdf || txt || rtf)
    Extract text data from the source document through standard algorithms for document manipulation
ELSE
    Determine if web page conforms to W3C standards
    IF (conforms)
        Execute regular expressions to extract data from data source according to standard webpage structure laid out by the W3C
    ELSE
        Determine position of the text in the code through global code structure
        Refine text positioning through selective tag encapsulation
        Extract text from the code

Now that we know how this algorithm works, let's see a few examples of its execution. Extraction from documents will not be covered since this task is trivial, so we cover only text extraction from web pages, specifically from web pages that do not conform to W3C standards.

Data source: http://edition.cnn.com/2011/OPINION/02/10/opinion.roundup.egypt/index.html?hpt=C2
loaded: 11. 02. 2011.
W3C markup validation: 67 errors and 22 warnings.
Results: Everything relevant was extracted from the web page, so this web data source was extracted satisfactorily. Nevertheless, the algorithm did make one mistake: it extracted a sentence that is not a part of the main text. Overall, these are good results since this mistake is a minor problem and probably won't influence the mind map generator that will be developed.

Data source: http://en.wikipedia.org/wiki/Mind_map
loaded: 11. 02. 2011.
W3C markup validation: 2 errors
Results: The text was extracted satisfactorily, but there was a minor issue: the item list with mind map creation steps was excluded from the extracted text. After analyzing the code we concluded that the code structure and tag encapsulation were the reason this happened. It is arguable whether this error can be avoided: if we extracted list items from anywhere in the code, a lot of unnecessary text would be extracted as well. Overall, these are good results since this mistake is a minor problem and probably won't influence the mind map generator that will be developed.

Data source: http://www.formula1.com/news/headlines/2011/2/11732.html
loaded: 11. 02. 2011.
W3C markup validation: 63 errors, 29 warnings
Results: The text was extracted satisfactorily, with a minor issue: two links were added to the extracted text when they should not have been. This is just a minor issue and it probably won't influence the mind map generator that will be developed.

Data source: http://www.ieee.org/conferences_events/conferences/conferencedetails/index.html?Conf_ID=18814
loaded: 11. 02. 2011.
W3C markup validation: 29 errors, 28 warnings
Results: The text was extracted satisfactorily, with minor issues. Due to bad tag encapsulation, some links and a copyright statement were extracted although they are unnecessary. Whether or not this is going to be an issue for the mind map generator is still uncertain, but we will investigate it when we fully develop the mind map generator algorithm.
Data source: http://www.scopus.com/record/display.url?eid=2-s2.0-78650513002&origin=resultslist&sort=plff&src=s&sid=rGmYFrh19Fvdc98pFMnlKmW:50&sot=q&sdt=b&sl=28&s=TITLE-ABS-KEY-AUTH(mind+map)&relpos=0&relpos=0&searchTerm=TITLE-ABS-KEY-AUTH(mind map)
loaded: 11. 02. 2011.
W3C markup validation: 9 errors, 10 warnings
Results: The text was extracted satisfactorily, with minor issues: a copyright statement was also extracted and added to the text. Overall, these are good results since this mistake is a minor problem and probably won't influence the mind map generator that will be developed.

Data source: http://www.thehindu.com/news/international/article1409933.ece
loaded: 11. 02. 2011.
W3C markup validation: 88 errors, 34 warnings
Results: The text was extracted satisfactorily, with just one issue: one link was added to the extracted text, but this probably won't influence the mind map generator algorithm.

Data source: http://www.bbc.co.uk/news/business-12427680
loaded: 11. 02. 2011.
W3C markup validation: 3 errors
Results: This web page was parsed almost perfectly, and with good reason: it had only three errors according to the W3C validation criteria. Overall, these are good results and it is very unlikely that this text would mislead the mind map generator in any way.

Data source: http://classweb.gmu.edu/biologyresources/writingguide/ScientificPaper.htm
loaded: 11. 02. 2011.
W3C markup validation: 51 errors, 1 warning
Results: The text was extracted satisfactorily, with no issues. Therefore, the mind map generator should not have any problems generating a mind map from this text.

As these results show, when a web page conforms to W3C standards or has only a small number of irregularities, the text-mining algorithm has no problems determining what text is needed for analysis. When there were greater numbers of errors, however, we encountered problems. It is important to keep in mind that this algorithm will probably never be perfect: even if a web page is 100% W3C compliant, there is no guarantee that the needed text is well structured within the code. Therefore, as long as we are getting more or less good results with only minor issues, the mind map generator can probably be developed so that these irregularities are mostly filtered out or, at least, don't influence the mind map generator algorithm too much.
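To make the web-page branch of the pseudo code more concrete, here is a minimal Python sketch. It is not the algorithm described in this paper: instead of regular expressions and tag-encapsulation analysis it uses the tolerant `html.parser` module from the Python standard library, and the tag sets `TEXT_TAGS` and `SKIP_TAGS` are assumptions chosen for illustration. It shows one simple way to keep paragraph-like text, to drop navigation, scripts and footers, and to cope with unclosed tags.

```python
from html.parser import HTMLParser

# Assumed tag sets; a real extractor would need to tune these per site.
TEXT_TAGS = {"p", "h1", "h2", "h3"}
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class MainTextParser(HTMLParser):
    """Collects text inside TEXT_TAGS, ignoring anything inside SKIP_TAGS."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0      # > 0 while inside a SKIP_TAGS element
        self._in_block = False    # True while inside a TEXT_TAGS element
        self._current = []        # text pieces of the block being read
        self.blocks = []          # finished text blocks, in page order

    def _flush(self) -> None:
        text = " ".join("".join(self._current).split())
        if text:
            self.blocks.append(text)
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in TEXT_TAGS:
            self._flush()         # an unclosed block ends when a new one starts
            self._in_block = True

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in TEXT_TAGS:
            self._flush()
            self._in_block = False

    def handle_data(self, data):
        if self._in_block and self._skip_depth == 0:
            self._current.append(data)

    def close(self):
        super().close()
        self._flush()


def extract_text(source: str, kind: str) -> str:
    """Return the text of `source`; `kind` is 'txt' or 'html'.

    Formats such as pdf or doc need third-party libraries and are
    deliberately left out of this sketch.
    """
    if kind == "txt":
        return source
    if kind == "html":
        parser = MainTextParser()
        parser.feed(source)
        parser.close()
        return "\n".join(parser.blocks)
    raise ValueError(f"unsupported source kind: {kind}")
```

This sketch skips list items altogether, so it would show the same behaviour as the Wikipedia example, where the item list was missing from the output. Adding `li` to `TEXT_TAGS` would recover such lists at the cost of also pulling in menu entries on many pages, which is the trade-off discussed in the results.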


4. Conclusion and further research

Mind maps are very useful in different fields, such as learning, memorizing, structuring data and speeding up the search process; with them we could process much more information in less time. The process of creating a mind map is slow, and all the tools available today are just editors that help us create mind maps. Generating mind maps from plain text removes much of the time required to make them, so we can focus on using them. We plan to develop a system based on this model, with all the features that we described; further research is needed to implement them. The developed text-mining algorithm will be a part of this system, and its outputs will be passed to a mind map generator algorithm that is still in development. Development of such a system would contribute to the popularization of mind maps, and we believe it would greatly increase the use of mind maps in many areas. Also, this compact way of presenting data with mind maps, and then creating information and new knowledge from them, is very useful in knowledge management, where we could even implement some kind of reasoning over these maps and the information that we created.

5. References

[1] Abdeen M, El-Sahan R, Ismaeil A, El-Harouny S. Direct automatic generation of mind maps from text with M2Gen. IEEE Toronto International Conference Science and Technology for Humanity, 2009, p. 95
[2] Beel J, Gipp B. Enhancing search applications by utilizing mind maps. Proceedings of the 21st ACM Conference on Hypertext and Hypermedia, 2010, p. 303-304
[3] Beel J, Gipp B, Stiller J-O. Information retrieval on mind maps – what could it be good for?. Collaborative Computing: Networking, Applications and Worksharing, 2009.
[4] Bia A, Munoz R, Gomez J. Using mind maps to model semistructured documents. Lecture Notes in Computer Science, Volume 6273, 2010, p. 421-424
[5] Burgess-Allen J, Owen-Smith V. Using mind mapping techniques for rapid qualitative data analysis in public participation processes. Health Expectations, Volume 13, Issue 4, 2010, p. 406-415
[6] Buzan T. The Mind Map Book. Penguin Books, 2006.
[7] Chia-Chia Lin, Dong-Her Shih. Mind Mapping: A Creative Development in Industrial Engineering Education. Wireless Communications, Networking and Mobile Computing, 2009.
[8] Dhindsa HS, Makarimi-Kasim, Roger Anderson O. Constructivist-Visual Mind Map Teaching Approach and the Quality of Students’ Cognitive Structures. Journal of Science Education and Technology, 2010, p. 1-15
[9] Edwards S, Cooper N. Mind Mapping as a Teaching Resource. Clinical Teacher, Volume 7, Issue 4, December 2010, p. 236-239
[10] Goldstein EB. Cognitive psychology: connecting mind, research, and everyday experience. Wadsworth, Cengage Learning, 2008.
[11] Jianxin C. The using of mind map in concept design. 9th International Conference on Computer-Aided Industrial Design and Conceptual Design, 2008, p. 1034
[12] Jogalekar UA, Mangla A. Idea generation algorithm based systems. 3rd IEEE International Conference on Computer Science and Information Technology, Volume 7, 2010, p. 14-17
[13] Poray J, Schommer C. A Cognitive MindMap Framework to Foster Trust. 5th International Conference on Natural Computation, 2009, p. 3
[14] Poray J, Schommer C. Managing conversational streams by explorative mindmaps. ACS/IEEE International Conference on Computer Systems and Applications, 2010, Article number 5587033
[15] Seyihoglu A, Kartal A. The views of the teachers about the mind mapping technique in the elementary Life Science and Social Studies lessons based on the constructivist method. Kuram ve Uygulamada Egitim Bilimleri, Volume 10, Issue 3, 2010, p. 1637-1656
[16] Völkel M, Haller H. Conceptual data structures for personal knowledge management. Online Information Review, 2009, p. 298
[17] Wang S, Wang L. Mindmap-NG: A novel framework for modeling effective thinking. 3rd International Conference on Computer Science and Information Technology, Volume 2, 2010, p. 480-483
[18] Wu Ze-jun, Wang Xin-an, Wu Yun. E-government System Demand Analysis Based on Mind Map. International Conference on Networking and Digital Society, 2009, p. 254
[19] Zhang F. The application of visualization technology on knowledge management. International Conference on Intelligent Computation Technology and Automation, 2008, p. 767
[20] Zualkernan IA, AbuJayyab MA, Ghanam YA. An Alignment Equation for Using Mind Maps to Filter Learning Queries from Google. Advanced Learning Technologies, 2006, p. 153
