Download Citespace Manual Free


One of the hallmarks of the current era is the availability of a wide assortment of scientific research in the form of peer-reviewed literature. However, while near-global online connectivity has shrunk the world, the corpus of scientific literature is expanding at such a scale that citation indices are often unable to keep up, as noted by Larsen and von Ins (2010). Every day, numerous research papers are submitted and peer-reviewed, and some are published. In this continually expanding digital universe, it can be intimidating for researchers to keep up with and locate trends and hot topics in peer-reviewed work. Citations, authorship patterns, and related topics are of interest to every research community in general, and to the Complex Adaptive Systems Modeling community in particular.

CiteSpace (Chen 2006) has established itself as an excellent tool for identifying key patterns in the dissemination and spread of scientific information. The tool uses various innovative techniques and algorithms for information visualization (Jin-xia 2011), exploration (Wei et al. 2015), and conducting visual surveys (Niazi and Hussain 2011). While there is an existing supporting website for the tool, the CiteSpace community was really looking forward to a comprehensive book on the topic. As such, Prof. Chen’s book on CiteSpace (Chen 2016) is a very welcome addition.


PMID: 16779135

Abstract

This article presents a description and case study of CiteSpace II, a Java application which supports visual exploration with knowledge discovery in bibliographic databases. Highly cited and pivotal documents, areas of specialization within a knowledge domain, and emergence of research topics are visually mapped through a progressive knowledge domain visualization approach to detecting and visualizing trends and patterns in scientific literature. The test case in this study is progressive knowledge domain visualization of the field of medical informatics. Datasets based on publications from twelve journals in the medical informatics field covering the time period from 1964–2004 were extracted from PubMed and Web of Science (WOS) and developed as testbeds for evaluation of the CiteSpace system. Two resulting document-term co-citation and MeSH term co-occurrence visualizations are qualitatively evaluated for identification of pivotal documents, areas of specialization, and research trends. Practical applications in bio-medical research settings are discussed.

INTRODUCTION

The scientific literature has been estimated to grow at a rate of 6% per year [,2]. Record counts collected from the PubMed database show a fifty-percent increase in the number of records indexed by year of publication over the past fifteen years (Figure 1). With this growth rate in scientific literature come ever-increasing challenges for investigators and clinicians to become acquainted with the core literature of their field, conduct literature reviews, keep abreast of a field, and search for relevant documents. This growth of the literature is reflected in the concomitant growth in the size and complexity of bibliographic databases.
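
To put the two growth figures above on a common footing, the following minimal sketch converts between a constant annual growth rate and total growth over a period. It assumes simple compound growth, which the text does not state explicitly, and the function names are hypothetical.

```python
import math

def doubling_time(annual_rate):
    """Years needed to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)

def implied_annual_rate(total_growth, years):
    """Constant annual rate that compounds to total_growth over the given years."""
    return (1 + total_growth) ** (1 / years) - 1

# Doubling time for a literature growing at 6% per year.
years_to_double = doubling_time(0.06)

# Annual rate implied by a fifty-percent increase over fifteen years.
annual_rate = implied_annual_rate(0.50, 15)

print(f"doubling time: {years_to_double:.1f} years")
print(f"implied annual rate: {annual_rate * 100:.2f}%")
```

Under the compound-growth assumption, a 6% yearly rate doubles the literature in roughly twelve years, while the PubMed record counts correspond to a lower yearly rate of a little under 3%.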

Figure 1. Number of PubMed Records by Year of Publication 1990–2004

We feel that there are strong parallels between bibliographic databases and clinical data warehouses, and that citation data is suitable for a Knowledge Discovery in Databases (KDD) approach that uses specialized data mining tools. The KDD approach to data analysis is usually a retrospective analysis of data and does not involve consideration of experimental design and related concepts []. KDD has been defined as the automated or convenient extraction of patterns representing knowledge explicitly stored in large databases, data warehouses, or other large repositories. The process of evaluating data, analyzing patterns, and extracting knowledge is analogous to the sorting, cleaning, and grading process involved in mining minerals [4]. The knowledge discovery process is applied to explain existing data, make predictions or classifications, or summarize contents of large databases to support decision making [].

THE CiteSpace II APPLICATION

This article presents a description and case study of CiteSpace II, a Java application which combines information visualization methods, bibliometrics, and data mining algorithms in an interactive visualization tool for extraction of patterns in citation data. A pilot study [] of medical informatics applied document co-citation analysis (DCA), combined with Pathfinder Network Scaling (PFNET), visualization, and animation, to a limited dataset based on AMIA publications in order to develop a 3-dimensional (3-D) knowledge landscape. Animated 3-D models vividly depicted the growth of the field, but they were cognitively demanding. CiteSpace II incorporates substantial changes since our previous report. Due to space limitations, only a brief summary of the theoretical and methodological basis on which CiteSpace II was developed is presented here. Detailed reports can be found in Chen, 2004 and Chen, 2005 [, 8].

The primary goal of CiteSpace II is to facilitate the analysis of emerging trends in a knowledge domain. Knowledge domains are modeled and visualized as a time-variant duality between two fundamental concepts in information science: research fronts and intellectual bases. The concept of a research front was originally introduced by Price []. In a given field, a research front refers to the body of articles that scientists actively cite. Persson [9] made a distinction between a research front and an intellectual base (p. 31): “In bibliometric terms, the citing articles form a research front, and the cited articles constitute an intellectual base.”
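
Persson’s distinction can be illustrated with a minimal sketch: given a list of (citing, cited) pairs, the citing articles form the research front and the cited articles form the intellectual base. The function name and the article identifiers are hypothetical and serve only to illustrate the definition.

```python
def front_and_base(citations):
    """Split (citing, cited) pairs into a research front and an intellectual base."""
    research_front = {citing for citing, _ in citations}
    intellectual_base = {cited for _, cited in citations}
    return research_front, intellectual_base

# Hypothetical citation pairs: (citing article, cited article).
pairs = [
    ("A2004", "P1965"),
    ("B2004", "P1965"),
    ("B2004", "Q1990"),
]

front, base = front_and_base(pairs)
print("research front:", sorted(front))
print("intellectual base:", sorted(base))
```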

New features of CiteSpace II are related to three central concepts: 1) Kleinberg’s burst detection algorithm is adapted to identify emergent research front concepts [10], 2) Freeman’s betweenness centrality metric is used to highlight potential pivotal points [11], and 3) heterogeneous networks. A knowledge domain is conceptualized as a mapping function between a research front and its intellectual base. This mapping function provides the basis of a conceptual framework to address three practical issues: 1) identifying the nature of a research front, 2) labeling a specialty, and 3) detecting emerging trends and abrupt changes in a timely manner. CiteSpace collects n-grams (single words or phrases of up to four words) from titles, abstracts, descriptors, and identifiers of citing articles in a dataset. Research front terms are determined by the sharp growth rate of their frequencies. Two complementary views for analyzing and visualizing 2-D co-citation networks are designed and implemented: cluster views and time-zone views. The new methods in CiteSpace II have improved the clarity and interpretability of visualizations so as to reduce the user’s cognitive burden as they search for trends and pivotal points in a knowledge structure.
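
The term-collection step can be sketched as follows. The example extracts n-grams of up to four words and ranks them by a simple later-to-earlier frequency ratio. This ratio is a deliberately simplified stand-in for "sharp growth" and is not Kleinberg’s burst detection algorithm; the tokenization rule, the add-one smoothing, the function names, and the sample titles are all assumptions made for illustration.

```python
import re
from collections import Counter

def ngrams(text, max_n=4):
    """Return all word n-grams of length 1..max_n from a piece of text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [" ".join(words[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(words) - n + 1)]

def growth_ratio(earlier_texts, later_texts, max_n=4):
    """Ratio of later to earlier frequency for each n-gram seen later.

    Add-one smoothing keeps the ratio finite for terms that are new.
    """
    earlier = Counter(g for t in earlier_texts for g in ngrams(t, max_n))
    later = Counter(g for t in later_texts for g in ngrams(t, max_n))
    return {g: (later[g] + 1) / (earlier[g] + 1) for g in later}

# Hypothetical titles from an earlier and a later time slice.
earlier_titles = ["decision making"]
later_titles = ["patient safety", "patient safety alerts"]

ratios = growth_ratio(earlier_titles, later_titles)
fastest = sorted(ratios, key=ratios.get, reverse=True)[:3]
print(fastest)
```

A real burst detector models term arrival rates over many slices rather than comparing two counts, so this sketch only conveys the idea of selecting terms whose frequency rises sharply.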

The CiteSpace II application has two major interface components. The first component is used for designating the data and analysis parameters, and is shown in Figure 2. The primary source of data for CiteSpace analysis is the Web of Science, from which data must be downloaded prior to using CiteSpace. CiteSpace II also allows users to download citation data directly from PubMed. Research front terms are extracted by first running the Burst Detection option. Users specify the range of years to be analyzed, the length of time slices within that interval, and three sets of threshold levels for citation counts, co-citation counts, and co-citation coefficients (c, cc, ccv). The specified thresholds are applied to the earliest, middle, and last time slices. Linearly interpolated thresholds are assigned to the remaining slices. Network pruning, merging, and layout options are also set by users. The second interface component allows users to interact with and manipulate the visualization of a knowledge domain in several ways. Visual attributes of the display as well as a variety of parameters used by the underlying layout algorithms can be adjusted. Figure 3 illustrates a zoomed view of an author co-citation cluster that has been marked with marquee selection, and the resulting display of associated MeSH headings and retrieval of related article abstracts from PubMed.
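
The threshold scheme described above can be sketched as piecewise linear interpolation between the three user-specified (c, cc, ccv) triples. This is a minimal illustration, not CiteSpace II’s own code: the function name, the choice of middle slice when the number of slices is even, the absence of rounding, and the sample threshold values are all assumptions.

```python
def interpolate_thresholds(first, middle, last, n_slices):
    """Linearly interpolate (c, cc, ccv) thresholds across n_slices time slices.

    first, middle, and last are the triples given for the earliest, middle,
    and last slice; every other slice receives an interpolated triple.
    """
    if n_slices < 3:
        raise ValueError("need at least three slices")
    mid = (n_slices - 1) // 2  # assumed index of the middle slice
    result = []
    for i in range(n_slices):
        if i <= mid:
            start, end, t = first, middle, i / mid
        else:
            start, end, t = middle, last, (i - mid) / (n_slices - 1 - mid)
        result.append(tuple(x + (y - x) * t for x, y in zip(start, end)))
    return result

# Hypothetical thresholds for five one-year slices.
for triple in interpolate_thresholds((2, 1, 10), (4, 3, 20), (6, 5, 30), 5):
    print(triple)
```

The three specified slices keep their values exactly, and the slices in between change in equal steps.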

Figure 2. CiteSpace II Interface for Configuring Analysis

Figure 3. CiteSpace Visualization Interface

METHODS

Two new datasets for analysis of medical informatics were developed as a testbed for CiteSpace II. The Institute for Scientific Information’s (ISI) Journal Citation Reports list of medical informatics journals for 2003 was cross-referenced against a list of medical informatics journals from AMIA [12]. The twelve journals that both resources had identified as important or relevant to medical informatics were selected for study. These twelve journals were also checked against the NCBI journals database for publication history, and the journals which were predecessors of some of the current journals were identified. Citation data was exported from Web of Science, and a query was run against the PubMed database from within CiteSpace. Because ISI has indexed meeting abstracts under journal names instead of conference proceeding names, meeting abstracts were excluded from the WOS data. This resulted in a WOS dataset of 11,952 citation records covering forty years from 1964 to 2004, and a PubMed dataset of 13,369 records covering a closely equivalent time period and set of journals (Table 1). The datasets cover a larger period of time than Morris and McCain’s 1998 journal co-citation study, and match on nine of the twenty journals from that study, which covered the indexing period January 1993–July 1995.

Table 1

ISI Full Journal Title | JCR 2003 Impact Factor | JCR 2003 I.F. Rank | Years Indexed in PubMed | Records in PubMed Dataset | Years Indexed in WOS | Records in WOS Dataset
Artificial Intelligence In Medicine | 1.222 | 6 | 1993– | 491 | 1992– | 623
Cin-Computers Informatics Nursing (1) | 0.217 | 19 | 1983– | 778 | 1992– | 249
Computer Methods And Programs In Biomedicine (2) | 0.724 | 14 | 1971– | 2122 | 1975– | 2063
IEEE Transactions On Information Technology In Biomedicine | 1.274 | 5 | 1997– | 304 | 2000– | 210
International Journal Of Medical Informatics (3) | 1.178 | 8 | 1970– | 1953 | 1975– | 1757
International Journal Of Technology Assessment In Health Care | 0.754 | 12 | 1985– | 1370 | 1995– | 742
Journal Of The American Medical Informatics Association (4) | 2.51 | 1 | 1994– | 736 | 1994– | 1674*
Journal Of Biomedical Informatics (5) | 0.855 | 11 | 1967– | 1584 | 1968– | 1555
M D Computing | 0.500 | 17 | 1984–2/2001 | 836 | 1984–02/2001 | 500*
Medical Decision Making | 1.718 | 3 | 1981– | 1164 | 1983– | 871*
Medical Informatics And The Internet In Medicine | 0.915 | 10 | 1999– | 134 | 01/1999– | 136
Methods Of Information In Medicine | 1.417 | 4 | 1965– | 1897 | 1964– | 1572*
Total | | | 1965–2004 | 13369 | 1964–2004 | 11952

(2) Continues Computer Programs in Biomedicine;
(3) Continues International Journal of Bio Medical Computing;
(4) WOS has AMIA Symposium Proceedings 1994–2002 indexed as supplement to JAMIA;
* Meeting abstracts excluded.

RESULTS

Due to the limited space, only the major findings from two examples of the visualizations produced with CiteSpace II are described: a cluster view (Figure 4) and a time-zone view (Figure 5). Table 2 shows the visualization parameters, and the system used was a 1600 MHz Pentium notebook with 1 GB RAM. The Burst Detection process completed running on each dataset in two to three minutes. The visualization in each figure was generated in less than one minute. The following interpretations by two of the authors of this article are based on their own experience and knowledge of medical informatics. The visualizations are qualitatively evaluated for identification of pivotal documents, areas of specialization, and research trends.

Figure 4. Cluster view of Medical Informatics 2000–2004

Figure 5. Time-zone view of Medical Informatics 1990–2004

Table 2

View | Cluster (Figure 4) | Time-Zone (Figure 5)
Data Source | PubMed | WOS
Analysis Type | MeSH Term Co-occurrence | Document-Term Co-citation
Publication Years | 2000–2004 | 1990–2004
Slice | 1 year | 5 years
Modeling | Cosine, within slices | Cosine, within slices
Thresholding (c/cc/ccv) | 5/3/25 | 7/3/30
Pruning | Pathfinder | None
Layout | Merged | Time-Zone, Merged
Burst Terms | 11,137 | 9,869
Document/Term Space* | 9,066 | 136,469
Nodes & Links | 151 & 148 | 212 & 279
Run Time (milliseconds) | 35,961 | 42,581
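
The Modeling row of Table 2 lists "Cosine, within slices". As a rough illustration, and assuming this refers to the usual cosine normalization of co-citation counts, cc(i, j) / sqrt(c(i) * c(j)), the sketch here computes that coefficient from the reference lists of the citing articles in one time slice. The function name and the sample reference lists are hypothetical, and this section does not confirm that CiteSpace II computes the coefficient in exactly this way or how the ccv threshold is scaled.

```python
import math
from collections import Counter
from itertools import combinations

def cocitation_cosine(reference_lists):
    """Cosine-normalized co-citation strength for each pair of cited items.

    reference_lists holds one collection of cited items per citing article.
    """
    cited = Counter()    # c(i): number of articles citing item i
    cocited = Counter()  # cc(i, j): number of articles citing both i and j
    for refs in reference_lists:
        unique = sorted(set(refs))
        cited.update(unique)
        cocited.update(combinations(unique, 2))
    return {pair: n / math.sqrt(cited[pair[0]] * cited[pair[1]])
            for pair, n in cocited.items()}

# Hypothetical reference lists of three citing articles in one slice.
slice_refs = [["a", "b"], ["a", "b"], ["a", "c"]]
for pair, value in sorted(cocitation_cosine(slice_refs).items()):
    print(pair, round(value, 3))
```

Pairs whose coefficient falls below the chosen threshold would then be dropped before the network is laid out.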

The cluster view (Figure 4) provides an overview of research areas within the field of medical informatics during the years from 2000 to 2004. In this visualization the node size represents the overall frequency of occurrence of keyword terms, and the colored rings of the nodes represent yearly time slices. A trail of several pink-rimmed nodes (those with a high measure of “betweenness centrality”) highlights a transition from the early decrease in “technology assessment” to the growth then decrease in “administration & organization” to the recent increase in the frequency of the term “methods”. In comparison to previous journal co-citation multidimensional scaling displays [], the specialties are automatically labeled at the level of detail of MeSH headings and keyword terms, as opposed to manual assignment of labels at the level of clusters of journals. This affords insight into the structure of a knowledge domain without requiring prior domain or journal knowledge, but does still require conceptualizing labels for clusters of terms. The time-slicing feature of CiteSpace also provides information on the relative activity of research areas within time periods.
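
Betweenness centrality, the measure behind the pink-rimmed nodes, scores a node by the share of shortest paths between other node pairs that pass through it. The following minimal sketch computes it for a small undirected, unweighted graph using Brandes’ algorithm, which is a standard way to compute the measure and is an assumption here rather than a description of CiteSpace II’s implementation. The function name and the keyword graph are hypothetical.

```python
from collections import deque

def betweenness(graph):
    """Unnormalized betweenness centrality for an undirected, unweighted graph.

    graph maps each node to the set of its neighbours.
    """
    score = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from source.
        order = []
        preds = {v: [] for v in graph}
        paths = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Accumulate path dependencies in reverse breadth-first order.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected path was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}

# Hypothetical keyword graph in which "methods" bridges two groups.
keywords = {
    "technology assessment": {"methods"},
    "administration": {"methods"},
    "methods": {"technology assessment", "administration", "statistics"},
    "statistics": {"methods"},
}
print(betweenness(keywords))
```

In this toy graph only the bridging node receives a non-zero score, which is the property that makes high-betweenness nodes candidates for pivotal points.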

The time-zone view (Figure 5) adds further insight by mapping the highly cited and pivotal documents that constitute the knowledge base of medical informatics and the timing of emergence of new topics. Figure 5 depicts the evolution of themes that could be considered central to medical informatics research and practice over time. There are a number of particularly prominent themes, such as ROC curve analysis and decision making in the early 1990s, giving way to practice guidelines and patient safety by the turn of the century. Concomitantly, there is a shift in the centrality of certain authors that largely parallels the focal areas, as is to be expected.

DISCUSSION AND CONCLUSION

CiteSpace II is a system that could potentially be used by a wide range of users, notably scientists, clinicians, science policy researchers, and medical librarians. For example, clinical researchers would find CiteSpace II particularly useful in creating domain-specific ontologies for use in developing evidence-based knowledge bases for decision support. Information scientists and librarians would find it indispensable for tracking the growth of new areas, virtually in real time, which in turn could aid in collection development. However, there are several limitations to using CiteSpace II, the most important of which is the learning curve required to set accurate visualization parameters. In addition, some maps and clusters may be highly complex, requiring specialized domain knowledge for interpretation. Even with these limitations in mind, CiteSpace II should prove to be a very valuable tool for a variety of users.

Footnotes

Notes. CiteSpace II is available for download from: http://cluster.cis.drexel.edu/~cchen/citespace.


References


1. Price DD. Networks of scientific papers. Science. 1965 Jul 30;149:510–5.
2. Fernández-Cano A, Torralbo M, Vallejo M. Reconsidering Price's model of scientific growth: An overview. Scientometrics. 2004 Jan;61(3):301–321.
3. Smyth P. Data mining: data analysis on a grand scale? Stat Methods Med Res. 2000 Aug;9(4):309–27. Review.
4. Han J, Kamber J. Data Mining: Concepts and techniques. San Francisco: Morgan Kaufmann Publishers; 2001.
5. Babic A. Knowledge discovery for advanced clinical data management and analysis. Stud Health Technol Inform. 1999;68:409–13. Review.
6. Synnestvedt M, Chen C. Visualizing AMIA: a medical informatics knowledge domain analysis. AMIA Annu Symp Proc. 2003:1024.
7. Chen C. Searching for intellectual turning points: progressive knowledge domain visualization. Proc Natl Acad Sci U S A. 2004 Apr 6;101(Suppl 1):5303–10.
