Taxonomy Tools Requirements and Capabilities
Joseph A Busch, Project Performance Corporation
Zachary R Wahl, Project Performance Corporation
Tools
• Taxonomy editing – Data Harmony, Mondeca, MultiTes, PoolParty, Protégé, SmartLogic, Synaptica, TopBraid Composer, Wordmap
• Metadata tagging (automated categorization) – CIS, ConceptSearching, Data Harmony, MetaTagger, nStein, SmartLogic, Temis
• Content management – Documentum, Drupal, FatWire, Interwoven, Joomla!, OpenText, SharePoint
Typology of taxonomy tool functions
Functional area: Functions
Taxonomy Development: Create a taxonomy; User roles and permissions
Taxonomy Maintenance: Add, edit, move, delete items; Assign or modify privileges to one or a group of items; Activity logging
Taxonomy Governance: Approval workflow for additions and changes
Metadata Controlled Vocabulary: Assign attributes to a category; Associate controlled vocabulary with metadata field; Thesaurus capabilities
User Interface: Search and browse; Drag and drop; Multiple windows
Reporting: Alphabetical, hierarchical and other views; Visualizations; Importing and exporting taxonomies
Application Integration: APIs (WSDL, Scripts, Java, etc.); Application integration (CMS, DMS, search engine, etc.)
Normal taxonomy editor functionality requirements
Levels shown: Advanced, Midrange, Basic
• Standard and custom fields & attributes; Standard and custom relations; Data typing and restrictions; Consistency enforcement; Flexible reporting (exporting); Flexible importing
• Unicode; Multiple vocabulary support; Inter-vocabulary relations; Unique IDs/URIs: externally supplied IDs are not sufficient
• Workflow; Voting; Change Request Mgmt.; Stylistic rules enforcement; Programmability
• Term Editing; Hierarchy Browser
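Stylistic rules enforcement, listed above, can be pictured as a small validation pass over term labels. The following is a minimal sketch only: the function name `check_label`, the 40-character limit, and the specific rules are illustrative assumptions, not rules taken from any of the tools discussed.

```python
import re

# Hypothetical limit; a real style guide would set its own value.
MAX_LABEL_LENGTH = 40


def check_label(label: str) -> list[str]:
    """Return the style-rule violations found in one term label."""
    problems = []
    if len(label) > MAX_LABEL_LENGTH:
        problems.append(f"longer than {MAX_LABEL_LENGTH} characters")
    if "&" in label:
        problems.append("uses ampersand; spell out 'and'")
    if label != label.strip():
        problems.append("leading or trailing whitespace")
    if re.search(r"\s{2,}", label):
        problems.append("repeated whitespace")
    return problems


if __name__ == "__main__":
    for candidate in ["Finance", "Research & Development"]:
        print(candidate, "->", check_label(candidate) or "ok")
```

Returning a list of messages rather than raising on the first problem lets an editor show every violation for a label at once.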
Additional functionality for taxonomy editing software
• Aliases – Need to deal with synonyms, but also with alternative labels based on language or other factors.
• Notes – Useful to have several types of notes fields to keep public notes separate from team’s working notes.
• Effective dates – Enable the determination of what was the ‘valid’ taxonomy on dates in the past. Part of a set of strong requirements on provenance.
• URIs – Must be able to support the semantic web.
• Poly-hierarchy – Must be able to support multiple parents (as well as the same string with different meanings).
• Inter-category relations – Must be able to provide links that don’t follow hierarchy, and even go between vocabularies.
• Rules checking – Check conformance to style rules like length, use of ampersand, etc.
• Workflow – Tracking the handling of change requests, as well as the process of getting approvals for edits.
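Several of these requirements (URIs, aliases, separate notes, effective dates, poly-hierarchy) describe the shape of a term record. As a rough sketch under assumed names, such a record might look like the following; the `Term` class, its fields, and the `example.org` URIs are hypothetical and do not reflect the data model of any product named in this deck.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Term:
    uri: str                                   # unique ID, independent of the label
    label: str
    parents: set[str] = field(default_factory=set)               # parent URIs; several allowed
    aliases: dict[str, list[str]] = field(default_factory=dict)  # language code -> labels
    public_note: str = ""
    working_note: str = ""                     # team-only note, kept apart from public_note
    valid_from: date = date.min
    valid_to: Optional[date] = None            # None means still valid

    def valid_on(self, day: date) -> bool:
        """True if this term was part of the taxonomy on the given day."""
        return self.valid_from <= day and (self.valid_to is None or day < self.valid_to)


def taxonomy_on(terms: list[Term], day: date) -> list[Term]:
    """Return the terms that were valid on a past (or present) date."""
    return [t for t in terms if t.valid_on(day)]


if __name__ == "__main__":
    # Same string, different meanings: two terms that differ only by URI.
    planet = Term("http://example.org/t/1", "Mercury",
                  parents={"http://example.org/t/planets"})
    element = Term("http://example.org/t/2", "Mercury",
                   parents={"http://example.org/t/metals",
                            "http://example.org/t/elements"},
                   valid_from=date(2010, 1, 1), valid_to=date(2011, 1, 1))
    print([t.uri for t in taxonomy_on([planet, element], date(2012, 1, 1))])
```

Keying parents by URI rather than by label is what allows two terms with the identical label to coexist, and storing validity dates on each term is one simple way to answer "what was the taxonomy on date X".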
Scenarios to evaluate taxonomy tools
Functional area: Functions
Database Definition: How is the database created? Where is it stored? Is it Z39.19 and ISO 2788 compliant? Is a database license required?
Importing/Exporting Data: How are data imported? What file formats are supported? Can data files be imported in batches?
Add, Edit, Delete Categories: How easily are categories added, edited, or deleted? Can categories be added, edited, or deleted in batches?
Relationship Types: How are relationship types defined? What types are supported? How is polyhierarchy handled?
Add, Edit, Delete Relationships: How easily are relationships added, edited, or deleted? Can relationships be added, edited, or deleted in batches? Does a change propagate to all instances?
Reporting: How does the TMS report new, edited, and deleted taxonomies and categories; new, edited, and deleted relationship types and relationships; and mapped taxonomies and categories? How are the reports presented? What audit logs are available? Can changes be traced to the users who suggested them? Is an “approval” step for changes available for administrators?
User Access: Can the TMS integrate user accounts with existing authentication systems such as Active Directory? Is there support for role-based access or defined group membership with configurable access? Is there a workflow to approve changes? What functionality is available or restricted based on a user’s security privileges?
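The User Access questions (role-based access, an approval step, functionality restricted by privilege) can be made concrete with a small sketch. Everything here is an assumption for illustration: the three roles, the action names, and the `ChangeRequest` record are hypothetical and are not drawn from any evaluated tool.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    VIEWER = "viewer"
    EDITOR = "editor"
    ADMIN = "admin"


# Hypothetical mapping of roles to the actions they may perform.
PERMISSIONS = {
    Role.VIEWER: {"browse"},
    Role.EDITOR: {"browse", "propose_change"},
    Role.ADMIN: {"browse", "propose_change", "approve_change"},
}


def can(role: Role, action: str) -> bool:
    """True if the role is allowed to perform the action."""
    return action in PERMISSIONS[role]


@dataclass
class ChangeRequest:
    proposed_by: str                   # lets a change be traced to its suggester
    description: str
    approved_by: Optional[str] = None  # stays None until someone approves


def approve(request: ChangeRequest, approver: str, role: Role) -> ChangeRequest:
    """Record an approval, refusing approvers whose role lacks the privilege."""
    if not can(role, "approve_change"):
        raise PermissionError(f"{approver} ({role.value}) may not approve changes")
    request.approved_by = approver
    return request


if __name__ == "__main__":
    request = ChangeRequest("ana", "Rename 'HR' to 'Human Resources'")
    print(approve(request, "lee", Role.ADMIN))
```

When evaluating a tool, the equivalent questions are whether such a role-to-action mapping is configurable and whether the proposer and approver of each change are both recorded.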
[Screenshot: MultiTes search, browse and edit term UI. Callouts: Search Browse Alpha List; Edit Term]
[Screenshot: MultiTes add relationships UI. Callouts: Edit Term; Add Relationships]
[Screenshot: MultiTes strength is report writer. Callouts: Alphabetical Report; Hierarchy (Top term) Report]
MultiTes ratings
Functional area                    MultiTes
Database Definition                5
Importing Data                     3
Add, Edit, Delete Categories       2
Relationship Types                 4
Add, Edit, Delete Relationships    2
Reporting                          1
Exporting Data                     3
User Access                        0
Total Score                        20
Average Score                      2.5
Strengths
• Widely used.
• Inexpensive.
• Flexible report writer for exporting data.
[Screenshot: Synaptica term record. Callouts: Term; Parent; Children; Variations; Description/Scope Note]
[Screenshot: Synaptica tree view. Callouts: Term; Tree; Description/Scope Note]
[Screenshot: Synaptica visualizations. Callouts: Word Map; Radial Map]
Synaptica ratings
Functional area                    Synaptica
Database Definition                5
Importing Data                     4
Add, Edit, Delete Categories       5
Relationship Types                 4
Add, Edit, Delete Relationships    5
Reporting                          5
Exporting Data                     5
User Access                        5
Total Score                        38
Average Score                      4.75
Strengths
• Proven performance with very large data sets.
[Screenshot: TopBraid term record. Callouts: Hierarchy; Term; Relationships]
[Screenshot: TopBraid edit term record]
[Screenshot: TopBraid visualization. Callout: Graph View]
TopBraid ratings
Functional area                    TopBraid
Database Definition                5
Importing Data                     5
Add, Edit, Delete Categories       5
Relationship Types                 5
Add, Edit, Delete Relationships    5
Reporting                          5
Exporting Data                     5
User Access                        5
Total Score                        40
Average Score                      5
Strengths
• XML RDF under the hood
[Screenshot: SmartLogic term record. Callouts: Term; Hierarchy; Relationships]
[Screenshot: SmartLogic term edit options. Callouts: Add non-preferred terms; Add hier. relationships]
[Screenshot: SmartLogic visualization. Callout: Graph View]
SmartLogic ratings
Functional area                    SmartLogic
Database Definition                5
Importing Data                     4
Add, Edit, Delete Categories       5
Relationship Types                 4
Add, Edit, Delete Relationships    5
Reporting                          5
Exporting Data                     5
User Access                        4
Total Score                        37
Average Score                      4.63
Strengths
• Integrations with CMS (SharePoint, Documentum) and search engines (FAST, Google)
Summary of taxonomy tool ratings
Functional area                    MultiTes   SmartLogic   Synaptica   TopBraid
Database Definition                5          5            5           5
Importing Data                     3          4            4           5
Add, Edit, Delete Categories       2          5            5           5
Relationship Types                 4          4            4           5
Add, Edit, Delete Relationships    2          5            5           5
Reporting                          1          5            5           5
Exporting Data                     3          5            5           5
User Access                        0          4            5           5
Total Score                        20         37           38          40
Average Score                      2.5        4.63         4.75        5
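The totals and averages in the summary follow directly from the eight per-area ratings. As a quick check, the short sketch below recomputes them; the ratings are copied from the slides, while the names `RATINGS` and `summarize` are just labels chosen for this example.

```python
# Ratings per tool, in the order: Database Definition, Importing Data,
# Add/Edit/Delete Categories, Relationship Types, Add/Edit/Delete Relationships,
# Reporting, Exporting Data, User Access.
RATINGS = {
    "MultiTes":   [5, 3, 2, 4, 2, 1, 3, 0],
    "SmartLogic": [5, 4, 5, 4, 5, 5, 5, 4],
    "Synaptica":  [5, 4, 5, 4, 5, 5, 5, 5],
    "TopBraid":   [5, 5, 5, 5, 5, 5, 5, 5],
}


def summarize(scores: list[int]) -> tuple[int, float]:
    """Return (total score, unrounded average) for one tool."""
    total = sum(scores)
    return total, total / len(scores)


if __name__ == "__main__":
    for tool, scores in RATINGS.items():
        total, average = summarize(scores)
        print(f"{tool}: total={total}, average={average}")
```

The unrounded SmartLogic average is 4.625, which the slides report as 4.63.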
Taxonomy editing tools vendors
An immature area – No vendors are in upper-right quadrant!
[Quadrant chart. Axes: Ability to Execute (low to high); Completeness of Vision. Labeled quadrants: Niche Players; Visionaries. Annotations: "Most popular taxonomy editor is MS Excel"; "High functionality/high cost products ($50-100K)"; "MultiTes is widely used, cheap with functionality"]
Taxonomy Tools Vendor
Taxonomy Editing Tools: URL
Apelon Distributed Terminology System (DTS): www.apelon.com/Products/DTS/tabid/97/Default.aspx
Cuadra STAR/Thesaurus: www.cuadra.com/products/vocabulary.html
Thesaurus Master: www.dataharmony.com/products/thesaurus_master.html
MS Excel: www.microsoft.com
Intelligent Topic Manager: www.mondeca.com/Products/ITM
MultiTes Pro: www.multites.com
PoolParty Thesaurus Manager: poolparty.punkt.at/poolparty-thesaurus-manager-3-0-releasenotes/
Protégé: protege.stanford.edu
Semaphore Ontology Manager: www.smartlogic.com/home/products/semaphoremodules/ontology-manager/ontology-manager-overview
Synaptica: www.synapticasoftware.com
SAS Ontology Management: www.sas.com/text-analytics/ontology-management/index.html
Temis Luxid: www.temis.com/?id=201&selt=1
Term Tree: www.termtree.com.au
TopBraid Composer: www.topquadrant.com/products/TB_Composer.html
WordMap Designer: www.wordmap.com
Thank You
QUESTIONS
For More Information:
Zach Wahl Email:
[email protected] Twitter: twitter.com/#!/ppc_corp, twitter.com/#!/ZacharyWahl
Joseph Busch Email:
[email protected] Twitter: twitter.com/joebusch
Blog: blog.ppc.com Web: www.ppc.com/Pages/KMWorld2011.aspx