Part one of a series on our journey building juptr.io:
How a fundamental abstraction streamlines, simplifies and speeds up developing a web scale system
goes higher, faster and avoids dirty feet :)
Juptr.io is a content personalization, sharing and authoring platform. We crawl tens of thousands of blogs and media sites (German and English), then classify the documents to enable personalized content curation, consumption and discussion.
to crawl, store, query, analyze and classify millions of documents
a failure resistant distributed (micro)services architecture
Compared to ordinary enterprise-level projects, a startup has much tighter time and budget constraints (aka startup in Europe), so reducing development effort was a major priority.
We decided to use the following foundation stack .. [this blog has moved, continue reading here]
Word disambiguation combining LDA and word embeddings
Topical search tries to take into account the occurrence of words related to the search term, to avoid matching irrelevant documents. This post outlines how LDA can be used to disambiguate a word set obtained using word embeddings.
Texting Bots: The command line interface rebranded?
Texting bots and chat bots are all the hype, but is typing to a mostly dumb AI the way users want to interact with software? Maybe aspects other than AI and texting are the real killer features of chat bots.