Monday, December 22, 2014

Dictonomy: The Crowdsourced Hierarchical Dictionary

Welcome to Dictonomy. Yeah, it's a lousy name, but so was Airbnb before you repeated it a few thousand times. And 'lexonomy', a more mellifluous choice, was taken. So dictonomy it is... a DICTionary-taxONOMY of the English language.
  • What
    Dictonomy is a hierarchical dictionary of English. That means you can look up a word and find it in a tree or outline structure of related words. The words above the term in the tree are broader in meaning; the words below it are narrower in meaning. Note that this is about concepts, which are not quite the same entities as the entries in an alphabetical dictionary. The main example I've spotted of this difference concerns "parts of speech" (the different formats of a word). So, for instance, the concept of "empathy" might have a single node in the tree that serves for the noun, verb (empathize), and adjective (empathetic) forms of the word. Where the alphabetical dictionary has several entries, Dictonomy might have only one. This brings up the concept of the "native" format of a concept: some concepts seem to have naturally evolved starting as verbs, some as nouns, etc. We'll have to work that out, and 1,000 other things.

    One of the toughest things to work out will be "prime words." I think you read that first here (please tell me of any original use of the term 'prime word' and I'll properly attribute it). Prime words are words that have no broader term. I think good and bad might be examples.
  • Why
    Because it's not there already. I've noticed when doing crossword puzzles that words exist in a hierarchy, and often the clues point to answers above or below themselves in the tree. This in turn led to Bellis's Law of dictonomy (I have a lot of laws): "Any word can be identified by as few as two other words." Ask NYT crossword puzzle editor Will Shortz; I'm sure he can tell you it's true. Notice I said "identified," not "defined."
  • Who
    Me, and everyone else who wants to help create it.
  • How
    By using Twitter as a never-ending, probably circular, editorial platform. The circular part is probably OK because this outline will not be a neat, flat tree of mutually-exclusive entries. Like the human brain, which it essentially maps, it will be endlessly labyrinthine. Poetic justice, or more aptly, poetic accuracy.

    Recalling the book The Professor and the Madman, about the creation of the Oxford English Dictionary, I'm also OK with this process being endless and piecemeal. The OED was apparently a zillion slips of paper... a proto-Twitter if ever there was one.

    But whereas the OED was first-and-foremost a log of first-usage examples (love that idea), Dictonomy will twist that notion like a Möbius strip: there won't be definitions or examples or nuthin... just the broader-narrower relationships shown in a tree. Over time, yes, definitions and examples might be added, but most of that should probably be left to other sources. What might be unavoidable, however, is to add "relationships," meaning the concept that connects two nodes in the tree. This will start to be familiar to some as the "triplet" world of ontology and so forth. But I'm not an expert in that stuff so it's time for me to stop... and start getting you all to build the tree.

    As for duplicates added by many people, I'd originally thought of a fully designed software interface to manage the worklist, but I never made any progress on it, so I settled on free-form Twitter use. Besides, when several people post the same relationship, that's a form of verification, and what's wrong with verification?
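For anyone who thinks better in code, here is a minimal sketch of how the broader-narrower relationships described above could be stored, and how candidate "prime words" (terms with no broader term) could be pulled out. It is only an illustration, not how the dictonomy is actually built: the sample pairs and the function names `build_graph` and `prime_words` are made up for the example. Each term keeps a set of broader terms, so a word can sit under more than one parent, which fits the "not a neat, flat tree" idea.

```python
from collections import defaultdict

# Hypothetical sample data: (broader, narrower) pairs.
# These are illustrative only, not real dictonomy entries.
SAMPLE_PAIRS = [
    ("animal", "dog"),
    ("dog", "Great Dane"),
    ("pet", "dog"),  # "dog" has two broader terms, so this is a graph, not a strict tree
]


def build_graph(pairs):
    """Index the pairs in both directions: term -> broader terms, term -> narrower terms."""
    broader = defaultdict(set)
    narrower = defaultdict(set)
    for broad, narrow in pairs:
        broader[narrow].add(broad)
        narrower[broad].add(narrow)
    return broader, narrower


def prime_words(pairs):
    """Return the terms that never appear as the narrower side of any pair."""
    broader, narrower = build_graph(pairs)
    all_terms = set(broader) | set(narrower)
    return sorted(term for term in all_terms if not broader.get(term))


if __name__ == "__main__":
    print(prime_words(SAMPLE_PAIRS))
```

In this toy data the terms without a broader term are "animal" and "pet". In a real, still-growing dictonomy such a list would only show terms that nobody has placed under something broader yet, so it is a list of candidates rather than proven prime words.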
Instructions for Participating
Option 1: Follower (Following @dictonomy on Twitter)
  1. Follow @dictonomy on Twitter.
  2. I'll try to post a word-of-the-day sort of term, if not daily, then occasionally.
  3. Post your own tweet that attempts to define the area of the tree in which the term appears, as follows:
    1. Include @dictonomy and #dictonomy in the tweet. I really don't know which one is the minimum needed because I haven't set up Part 2 of this whole thing: using the resulting dictonomy. Stop laughing. If Twitter's search capabilities aren't sufficient, it will have to be some programming thing (Twitter's API, or Application Programming Interface). So add both the @ mention and the hashtag, or email me if you know the best solution.
    2. Put the term between greater-than signs (>) to separate it from its broader and narrower terms, as follows:
      1. broader term>term>narrower term
      2. animal>dog>Great Dane
    3. You can include whatever terms you want... just a broader term, just a narrower term, or both.
    4. Just put one relationship on each row/sentence. Only time will tell if one 'statement' per tweet is best.
    5. Don't include comments or definitions or any other noise.
    6. Shorthands and other relationship notations such as colons and semicolons might also evolve over time, but they should be avoided initially because we'll get a lot of mixed-format data and, Twitter being uneditable, it could be a mess.
      1. Notation examples to avoid initially: List: dog>Great Dane;Schnauzer;Chihuahua
      2. Notation examples to avoid initially: Sibling: dog::canine
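The tweet format above is simple enough to read by machine. As a rough sketch of what "Part 2" might start from, the following turns the text of a tweet into (broader, narrower) pairs. Everything here is an assumption for illustration: the function name `parse_tweet` is made up, the tweet text is assumed to be already in hand as a plain string (no Twitter API call is shown), and lines using the shorthand notations are simply skipped rather than interpreted.

```python
import re

# Matches the @dictonomy mention and the #dictonomy hashtag so they can be stripped out.
TAGS = re.compile(r"[@#]dictonomy\b", re.IGNORECASE)


def parse_tweet(tweet_text):
    """Turn lines like 'animal>dog>Great Dane' into (broader, narrower) pairs.

    Hypothetical helper: assumes one relationship chain per line, as the rules ask.
    """
    pairs = []
    for line in tweet_text.splitlines():
        cleaned = TAGS.sub("", line).strip()
        if ">" not in cleaned:
            continue  # no relationship on this line
        if ";" in cleaned or "::" in cleaned:
            continue  # shorthand notations are to be avoided for now, so skip them
        terms = [term.strip() for term in cleaned.split(">")]
        if not all(terms):
            continue  # a dangling '>' with nothing on one side
        # Each adjacent pair in the chain is one broader>narrower relationship.
        pairs.extend(zip(terms, terms[1:]))
    return pairs


if __name__ == "__main__":
    print(parse_tweet("@dictonomy #dictonomy animal>dog>Great Dane"))
```

A chain of three terms yields two pairs, so the sample tweet gives animal>dog and dog>Great Dane as separate relationships. Duplicate pairs from different people could then be counted as a simple form of verification.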
Option 2: Ad-Hoc/Your Own Term
Use this option when you find yourself thinking about the sense of any word, and want to add it to the dictonomy.
  1. Post your own tweet for that word, using the rules above in Option 1, starting at step 3. 
Option 3: Proactive Editing

Use this option if you want to just work on the project without any particular starting point.
  1. Find any interesting term such as from Roget's Thesaurus, by drilling down to the bottom level at 
  2. Post your own tweet for that word, to add it to the dictonomy, using the rules above in Option 1, starting at step 3.


