Software for e-Consultation Corpus Analysis and Representation


Seminar Paper, 2011
28 Pages, Grade: Defended

Excerpt

Table of contents

Software for e-Consultation Corpus Analysis and Representation

Abstract

e-Democracy and e-Consultation

Participatory Planning

ICTs, Information Overload and Information Processing
CSAV

Software Application
Text Mining Software in e-Consultation
Qualitative Text Analysis Software
Software Functionality Sought

Software Investigated
Semi-Automated Text Mining Software
Manual Text Analysis Software
Text Analysis Software Preferences

Submissions’ Analysis Process & CSAV Mapping
Manual Search Terms Used

CSAV Software
CSAV Software Investigated
CSAV Software Preference

Conclusion

Abstract

The global phenomenon of electronic governance (e-governance) and the advanced capacity for information generation by information and communication technologies (ICTs) have contributed to the perceived problem of information overload. In participatory democracy, and specifically e-democracy and e-consultation, in which a vast quantity and array of textual discourse can be generated, effective and efficient information processing is important. Effective and efficient processing will assist participants to make sense of, and remain engaged in, consultations. Accordingly, tools and technologies to assist in the analysis, synthesis and dissemination of such discourse have the potential to make a salient contribution. In this article, a critique of several software packages is presented, consisting of qualitative text analysis, natural language text mining and computer supported argument visualisation software. The use of natural language text mining software with sentiment analysis features was the initial focus of this investigation. However, early in the investigation and after a software trial, natural language text mining software was considered underdeveloped with regard to the specific functionality sought. Hence, the investigation then focused primarily on the utility of computer supported argument visualisation (CSAV) and text analysis software. For text analysis, Leximancer, TextAnalyst, Atlas.ti and TextSTAT were preferred and chosen from among the eleven programmes investigated. For CSAV, a programme called Compendium was preferred and chosen from among the twelve programmes investigated.

Keywords

Computer Supported Argument Visualisation, Knowledge Cartography, Qualitative Text Analysis, e-Democracy, e-Consultation, Natural Language Text Mining, Wicked Problems


e-Democracy and e-Consultation

The proliferation of ICTs and the impact of the Internet have led to a global phenomenon in which corporate and public institutions are increasingly moving to conduct administration and the delivery of services and programmes online (Rathee and Rishi 2011). An overarching term for this phenomenon is electronic governance (e-governance); the element in which governments engage with citizens on democratic matters, enabled by ICTs and the Internet, is known as e-democracy (OECD 2003). E-democracy can be partitioned into two distinct categories: 1) electronic voting and, relevant to this research, 2) electronic participation, which provides greater opportunity for civic engagement and citizen consultation in public policy-making (Macintosh 2004). One of the mechanisms utilised for such participatory democracy is e-consultation, in which elected representatives and government agencies use ICTs and the Internet to consult the citizenry on matters of democratic governance. For the purposes of this research, a case study was conducted on a significant public consultation programme in Queensland, Australia. The consultation was initiated by the Queensland State Government for the development of the South-East Queensland (SEQ) Regional Plan. In this case, a draft SEQ Regional Plan was released to the public for feedback. Several mechanisms were used to engage participants, including town meetings, postal and electronic public submissions and an online forum or e-Consultation. This study was scoped to focus on the e-Consultation component of the programme.

Participatory Planning

Such participatory planning processes and consultations around regional, urban and town planning evoke critical questions. How can the diverse perspectives of the citizenry, community groups, planners and government, together with consultation discourse and decision rationale, be captured, analysed, synthesised and represented? How can the government and the public stakeholders of regional planning consultations make sense of such high-volume and diverse discourse? (De Liddo and Buckingham Shum 2007). The high volume and complexity of information generated in consultative democracy can make it difficult for both the public and government to assimilate.

ICTs, Information Overload and Information Processing

The advanced information processing and networked capabilities of ICTs have led to a state in which information overload is a regularly cited problem (Ficco and Karamychev 2004; Baez et al. 2010). Accordingly, information processing and sense-making are of particular relevance in the burgeoning field of e-democracy and in consultative forums, which have the potential to generate large quantities of data and require accurate and efficient analysis of discourse. Although a contributor to the overload problem, ICTs may also be part of the solution (Eppler and Mengis 2004). Toward these ends, the Organisation for Economic Co-operation and Development (OECD) posed the question of whether technology can adequately support the analysis and summarisation of text submissions to democratic consultations (OECD 2004). Furthermore, Coleman and Norris (2005), and Renton and Macintosh (2004), found a need for research into tools and technologies that can aid the analysis, synthesis and dissemination of participatory democracy discourse.

CSAV

Computer supported argument visualisation (CSAV) has been found particularly applicable to supporting the analysis and representation of complex data such as that generated in consultative design activities in regional, urban and town planning (Conklin et al. 2007; Kirschner et al. 2003; Swedish Morphological Society 2005; Rittel and Webber 1984). This medium can help establish common ground within diversity, clarify positions, surface assumptions and collectively construct consensus (Kirschner et al. 2003). Within this environment, CSAV can therefore function to deliver an enhanced level of democratic transparency.

Software Application

A proposition from the E-democracy European Network project was that natural language processing is likely to be an effective tool for analysing, sorting and classifying communications (Carenini et al. 2007; Whyte and Macintosh 2003). Manual discourse analysis can be a time-consuming and expensive undertaking, and as participation in public consultation increases, this exercise is likely to become more burdensome. The e-Democracy Unit of the Queensland Government supported this view. If an organisation does not have the means to analyse communications in an efficient and effective manner, this deficiency can contribute to misinterpretation and misappropriation of communications (Bontis et al. 2003). Both quantitative and qualitative methods of content analysis are needed to aid in making sense of the myriad of voices in public consultations and to enable representatives to summarise and explain the deliberative process (Whyte and Macintosh 2003).

Text Mining Software in e-Consultation

An obvious advantage of e-Consultation over face-to-face consultation is that the discussion threads are stored and available for qualitative and quantitative analysis (Whyte and Macintosh 2003). Furthermore, consultation forum text is in the participant's own words, as opposed to being paraphrased by a facilitator, which may contribute to a more trustworthy process (Whyte and Macintosh 2003). Thread analysis is a useful feature of an electronic-forum data analysis process and has been used for the analysis of e-consultations in urban planning (Jankowski et al. 1997). Thread analysis can provide both quantitative and qualitative data to aid in assessing which topics, issues or questions stimulated participants and the extent to which particular topics attracted in-depth discussion. The assessment measures can include the quantity of comments posted per thread, the average and total word count per thread and thread depth (i.e. the number of levels of reply and the length of time between first and last contribution) (Whyte and Macintosh 2003).
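The thread-level measures listed above lend themselves to direct computation from stored forum data. The sketch below is a minimal illustration, not part of the original study; the `Post` record and its field names are assumptions about how forum data might be stored:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Post:
    thread_id: str
    depth: int          # reply level: 0 = original post
    timestamp: datetime
    text: str

def thread_metrics(posts):
    """Aggregate per-thread measures: post count, word counts,
    reply depth and elapsed time between first and last post."""
    threads = {}
    for p in posts:
        t = threads.setdefault(p.thread_id,
                               {"posts": 0, "words": 0, "max_depth": 0, "times": []})
        t["posts"] += 1
        t["words"] += len(p.text.split())
        t["max_depth"] = max(t["max_depth"], p.depth)
        t["times"].append(p.timestamp)
    return {
        tid: {
            "posts": t["posts"],
            "total_words": t["words"],
            "avg_words": t["words"] / t["posts"],
            "max_depth": t["max_depth"],
            "duration": max(t["times"]) - min(t["times"]),
        }
        for tid, t in threads.items()
    }
```

Such a summary would let an analyst rank threads by depth or duration before reading any of them in full.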

Qualitative Text Analysis Software

Weitzman (2003) argues that debate regarding opposition to researchers' use of qualitative data analysis software (QDAS) is based on the perception that the software will do the analysis for the researcher or analyst. A counter-argument, well supported in the literature, is that it is the researcher's responsibility to understand his or her chosen research approach and thus to effectively guide and interpret the analysis (Gilbert 2002; Macmillan and Koenig 2004; Morse and Richards 2002; Weitzman 2003; Atherton and Elsmore 2007). Furthermore, Weitzman (2003) argues that analysis software is only a support tool in theory building. Table 1 lists a number of reasons cited for the beneficial use of QDAS.

Table 1 – Benefits of QDAS

illustration not visible in this excerpt

Software Functionality Sought

The ideal functionality for the technology sought for the project was: (1) semi-automated analysis; (2) qualitative natural language text mining and analysis, along with (3) quantitative features to enable enhanced data description. Functionality was sought that would enable (3a) filtering, (3b) classifying and (3c) synthesis of each participant's text posting. The technology was to be used to (4) draw out the major concepts from the dialogue text and (5) visually map their interrelationships. Technical functionality was also sought that would enable (6) highlighting areas of agreement and disagreement and (7) investigating participant sentiment. Various off-the-shelf software packages provided some of the features sought, but no single package provided all of them.

The term text mining refers to technology that functions to automatically discover patterns and trends in large collections of unstructured text (Uramoto et al. 2004). Text mining assists the organisation and visualisation of text in multiple ways, at either the document or the text level. These technologies use algorithms to analyse text from user-specified perspectives. Examples of such perspectives are associations and trends between entity categories, such as between researcher names and research topics; medical drugs, drug effects and disease symptoms; or consultation participants and discourse topics (Mack et al. 2004).
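As a minimal, hypothetical illustration of the association-mining idea described above (not the workings of any particular package), the sketch below counts co-occurrences between consultation participants and discourse topics; naive keyword matching stands in for real entity extraction:

```python
from collections import Counter

def topic_associations(postings, topics):
    """Count participant-topic co-occurrences across a set of postings.

    postings: list of (participant, text) pairs.
    topics:   keywords to match, one association counted per posting
              in which the topic word appears.
    """
    pairs = Counter()
    for participant, text in postings:
        words = set(text.lower().split())
        for topic in topics:
            if topic in words:
                pairs[(participant, topic)] += 1
    return pairs
```

Sorting the resulting counter would surface which participants dominate which topics, one of the entity-association perspectives the text describes.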

Software Investigated

The text mining software packages with automation trialled were Logik (Coredge Software Inc. 2003), Copernic Summariser (Copernic Inc. 2001), AnSWR (Centers for Disease Control and Prevention 2003), CATPAC (Woelfel 1998), TextAnalyst (MicroSystems Co. Ltd. 2003) and Leximancer (McFadden 2003). DB2 Information Integrator OmniFind Edition (IBM 2004), Microsoft Excel, NVivo (QSR International Pty Ltd 2002) and Atlas.ti (Muhr 2004) were the manual text analysis applications trialled. In addition, a quantitative text analysis package, TextSTAT (Huning 2007), was trialled. The CSAV software trialled included Compendium (Bachler et al. 2007), Reason!able (Van Gelder and Bulka 2002), MindManager (Mindjet LLC 2007) and Decision Explorer (Banxia Software Ltd 2004). All of the software packages presented in this discussion can be run in a Microsoft Windows environment on a personal computer.

Semi-Automated Text Mining Software

Logik is a commercial knowledge discovery tool that helps users to access knowledge from within unstructured electronic information. It exposes document content in the form of summaries and key themes drawn from sources such as e-mail, files in public folders or on local hard drives, and documents on the Internet. It enables users to define algorithms for a specific search focus, but its features specialise in document search rather than text analysis, which was the focus of this project (Coredge Software Inc. 2003).

Copernic Summariser is a commercial automated text summariser that extracts concepts from input documents to provide a document summary or overview. The program processes text using semantic analysers and statistical models, attaching a weighting to sentences, and displays a summary of content based on relative importance. Although a useful tool, it does not give the user control over the analysis process, which was desired for this project (Copernic Inc. 2001).
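Copernic Summariser's internals are proprietary, but the general technique described, statistical weighting of sentences for extractive summarisation, can be sketched with simple word-frequency scoring. This is an illustrative stand-in under assumed scoring rules, not the product's actual model:

```python
from collections import Counter
import re

def summarise(text, n_sentences=2):
    """Score each sentence by the corpus-wide frequency of its words
    and return the top-scoring sentences in their original order."""
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Word frequencies over the whole document.
    freq = Counter(w for s in sentences for w in re.findall(r"\w+", s.lower()))
    # Rank sentence indices by the summed frequency of their words.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: -sum(freq[w]
                                       for w in re.findall(r"\w+", sentences[i].lower())))
    keep = sorted(ranked[:n_sentences])  # restore document order
    return " ".join(sentences[i] for i in keep)
```

Even this crude scorer shows why the paper notes a lack of user control: the weighting criteria are fixed inside the tool rather than exposed to the analyst.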

AnSWR, short for 'Analysis Software for Word-Based Records', is a free public domain tool originally developed to assist with managing and analysing large multi-site research studies that integrate qualitative and quantitative techniques. It enables analysts to analyse unstructured or semi-structured textual data, to explain how analytical decisions evolved and to examine the validity of the analytical lens used to frame, filter and present information. Features are included for coding and indexing ideas or themes, ordering codes, and establishing relationships between codes. Text retrieval is aided by the ability to select subsets of information using search parameters such as files, codes, coders, segments and attributes of information sources. It provides an audit trail that is useful for explaining how concepts, theories and propositions were developed (Centers for Disease Control and Prevention 2003).

CATPAC is a commercial intelligent program that can read text and summarise its main ideas. CATPAC is fully automated and thus needs no pre-coding and makes no linguistic assumptions. CATPAC also provides a variety of neural network options and cluster analysis algorithms. A case delimiter can be inserted at the end of each respondent's text dialogue, and the program will then treat each delimited text section as a separate case. It was not designed to include a wide range of known text analysis methods but was positioned to fill a market niche: fully automatic analysis of text without extensive pre-coding, independent of any linguistic theory or heuristics. Once again, user control over the text analysis process is limited (Woelfel 1998).
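The case-delimiter mechanism described above amounts to splitting one combined corpus into per-respondent documents. A minimal sketch follows; the `---` delimiter token is a placeholder convention for illustration, not CATPAC's actual syntax:

```python
def split_cases(corpus, delimiter="---"):
    """Split a combined corpus into one document per respondent,
    using a delimiter line between respondents' text blocks.
    Empty segments (e.g. a trailing delimiter) are dropped."""
    return [c.strip() for c in corpus.split(delimiter) if c.strip()]
```

Treating each segment as a separate case is what lets a fully automated tool report results per respondent without any manual coding.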

TextAnalyst is a commercial text analysis program capable of semi-automated natural language text analysis of text from arbitrary fields. It processes text, then develops and displays clusters and semantic relationships between words and topics within the text. The software displays analysis results by significance, as hyperlinked words that can be clicked to view the corresponding text sections. This enables efficient navigation of large texts and comparison within and between texts. The software creates automatic summaries of texts, and the text base can be queried for information retrieval using natural language queries (MicroSystems Co. Ltd. 2003). The figure below is an example of the output from an analysis of the SEQ Regional Plan Consultation discourse corpus using TextAnalyst.

Text Analyst Output Example

Illustration not visible in this excerpt

Leximancer is a commercial data-mining tool used for analysing the content of textual documents and displaying the elicited concepts and their interrelationships visually in a clustered concept map. It enables the efficient analysis of vast amounts of text. Concepts within a text are displayed in a manner that enables further exploration of their related subtext (Smith 2004a; Smith and Humphreys 2006). Leximancer provides the following sources of information about the content of textual documents.

[...]


Details

Title
Software for e-Consultation Corpus Analysis and Representation
College
Griffith University (Griffith University and Qantm College)
Grade
Defended
Author
Year
2011
Pages
28
Catalog Number
V179025
ISBN (eBook)
9783656012740
ISBN (Book)
9783656012566
File size
2822 KB
Language
English
Notes
This is from my Ph.D. thesis research but is current (2011).
Tags
software, corpus, analysis, representation
Quote paper
Dr. Ricky Ohl (Author), 2011, Software for e-Consultation Corpus Analysis and Representation, Munich, GRIN Verlag, https://www.grin.com/document/179025
