How we do it
Step 1: NLP & Word Vectors
spaCy
Uses spaCy's English multi-task CNN model, trained on OntoNotes, with GloVe vectors trained on Common Crawl. The model assigns word vectors, context-specific token vectors, POS tags, a dependency parse, and named entities.
Step 2: Dimension Reduction
spaCy
Performs Principal Component Analysis (PCA) to reduce the number of dimensions to 2.
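The reduction to 2 dimensions can be sketched as follows. This is a minimal, pure-Python illustration of PCA using power iteration, not the project's actual implementation; the function name pca_2d, the iteration count, and the toy 3D vectors are assumptions made for the example. It also assumes the data varies in at least two directions.

```python
import math

def pca_2d(rows, iterations=500):
    """Project each row (a list of floats) onto its top two principal components."""
    n, d = len(rows), len(rows[0])
    # Center the data: PCA works on deviations from the mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for k in range(2):
        # Arbitrary non-zero starting vector (an assumption of this sketch).
        v = [1.0 if j == k else 0.5 for j in range(d)]
        for _ in range(iterations):
            # Power iteration: multiplying by the covariance matrix pulls v
            # toward the direction of largest remaining variance.
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            # Keep the new direction orthogonal to components already found.
            for prev in components:
                dot = sum(x * y for x, y in zip(w, prev))
                w = [x - dot * y for x, y in zip(w, prev)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        components.append(v)
    # Coordinates of every centered row along the two components.
    return [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
            for c in centered]

# Toy stand-in for word vectors: most variance along x, some along y, none along z.
vectors = [[3.0, 0.0, 0.0], [-3.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, -1.0, 0.0]]
points = pca_2d(vectors)
```

Each entry of points is an (x, y) pair that can be plotted directly. In practice a library routine would likely replace the hand-written loop.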
Step 3: Fourier Transform
Fourier
Computes the Discrete Fourier Transform (DFT) of the sequence of 2D points.
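One common way to apply a DFT to points in a plane is to treat each point (x, y) as the complex number x + iy. The sketch below assumes that convention and divides by the number of points so that each coefficient's magnitude is directly a circle radius; both choices are assumptions for illustration, and the function name dft is hypothetical.

```python
import cmath

def dft(points):
    """DFT of 2D points treated as complex numbers x + iy, scaled by 1/n."""
    z = [complex(x, y) for x, y in points]
    n = len(z)
    return [
        sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
        for k in range(n)
    ]

# Four points evenly spaced on the unit circle, visited counterclockwise.
square = [(1, 0), (0, 1), (-1, 0), (0, -1)]
coeffs = dft(square)
# Only the coefficient at k = 1 is non-zero here: one circle of radius 1
# turning once per cycle is enough to visit all four points.
```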
Step 4: Epicycle Drawing
spaCy
An epicycle is a small circle whose center moves around the circumference of a larger one. Each term of the DFT is drawn as one such circle, and the tip of the last circle traces the path.
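To show how DFT coefficients become epicycles, here is a minimal sketch under the same assumptions as a standard Fourier drawing: points are complex numbers x + iy, and coefficients are scaled by 1/n. The names dft and epicycle_tip are hypothetical. Each coefficient defines one circle: its magnitude is the radius, its phase is the starting angle, and its index gives the rotation speed.

```python
import cmath

def dft(points):
    """DFT of 2D points treated as complex numbers x + iy, scaled by 1/n."""
    z = [complex(x, y) for x, y in points]
    n = len(z)
    return [
        sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
        for k in range(n)
    ]

def epicycle_tip(coeffs, t):
    """Pen position at time t in [0, 1): the tip of the chained circles."""
    n = len(coeffs)
    pos = 0j
    for k, c in enumerate(coeffs):
        # Indices in the upper half stand for negative (clockwise) frequencies.
        freq = k if k <= n // 2 else k - n
        # One circle: radius abs(c), starting angle cmath.phase(c),
        # completing `freq` turns per cycle. Its center is the running sum.
        pos += c * cmath.exp(2j * cmath.pi * freq * t)
    return pos

path = [(0, 0), (2, 1), (3, -1), (1, 4), (-2, 2)]
coeffs = dft(path)
# Sampling at t = m / n lands on the m-th original point;
# values of t in between give the smooth curve joining them.
tips = [epicycle_tip(coeffs, m / len(path)) for m in range(len(path))]
```

Drawings of this kind often sort the circles from largest to smallest radius before chaining them. The order changes how the animation looks but not the traced path.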
What can you change?
Text Analysis Methods
  • Word Embedding
    Uses word vectors as points in the path
  • Sentence Embedding
    Uses sentence vectors as points in the path
  • Sentence Grammar Space
    Uses sentence statistics (e.g. number of adjectives / number of words) as points in the path
  • Document Grammar Statistics
    Takes document statistics as values for each epicycle.
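As an illustration of the sentence grammar statistics mentioned above, the sketch below turns one sentence into a 2D point from two part-of-speech ratios. It assumes the tokens have already been tagged (for example by spaCy, whose token.pos_ attribute uses labels such as ADJ and VERB). The function names and the choice of adjectives and verbs as the two axes are hypothetical.

```python
def pos_ratio(tagged_tokens, pos):
    """Fraction of tokens whose part-of-speech tag equals `pos`."""
    if not tagged_tokens:
        return 0.0
    matches = sum(1 for _, tag in tagged_tokens if tag == pos)
    return matches / len(tagged_tokens)

def grammar_point(tagged_tokens, x_pos="ADJ", y_pos="VERB"):
    """Map a tagged sentence to an (x, y) point built from two tag ratios."""
    return (pos_ratio(tagged_tokens, x_pos), pos_ratio(tagged_tokens, y_pos))

# Hand-tagged example sentence: (word, tag) pairs.
sentence = [("The", "DET"), ("quick", "ADJ"), ("brown", "ADJ"),
            ("fox", "NOUN"), ("jumps", "VERB")]
point = grammar_point(sentence)  # 2 of 5 adjectives, 1 of 5 verbs -> (0.4, 0.2)
```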
Pattern
  • Connect by Distance
    Connects the dots by proximity, moving from each point to the next closest point.
  • Connect by Word Order
    Connects the dots in the order the words appear in the text.
  • Flower by Distance
    The path returns to the center after each point, then moves to the next closest point.
  • Flower by Word Order
    The path returns to the center after each point, then moves to the next word in the text.
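The four patterns above can be sketched with two small helpers. This is one possible reading, not the project's code: it assumes "next closest" means the nearest point not yet visited, that the path starts at the first point, and that the center is the origin. The function names are hypothetical.

```python
import math

def order_by_distance(points, start=0):
    """Greedy ordering: from each point, move to the closest unvisited point."""
    remaining = list(points)
    path = [remaining.pop(start)]
    while remaining:
        last = path[-1]
        nearest = min(range(len(remaining)),
                      key=lambda i: math.dist(last, remaining[i]))
        path.append(remaining.pop(nearest))
    return path

def flower(path, center=(0.0, 0.0)):
    """Return to the center after every point, which draws petal-like loops."""
    out = []
    for p in path:
        out.extend([p, center])
    return out

word_points = [(0, 0), (5, 0), (1, 0), (6, 0)]     # points in word order
by_distance = order_by_distance(word_points)        # Connect by Distance
flower_by_order = flower(word_points)               # Flower by Word Order
flower_by_distance = flower(by_distance)            # Flower by Distance
```

Connect by Word Order needs no helper, since the points are already in the order the words appear.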