Meet Poe, our Digital Concierge

Like many teams in the tech industry, a lot of our communication happens in Slack – and much of it in GIF form. Our conversing through the medium of looping kitties and sloths is made possible by two Slack apps with access to different image collections. While finding the perfect animated response is an important feature, integration APIs offered by popular chat-oriented tools like Slack, Microsoft Teams, Zulip and Fleep are extending the functionality of these platforms to offer genuine collaboration and productivity benefits.

For us, it's a poll bot to survey the team, the Google Drive app for convenient access to our documents, and the Trello app for pushing activity notifications from project boards into the corresponding project channel. But these APIs are not just targeted at third-party developers. Many organisations are realising the value of developing internal integrations that define custom workflows specific to their team, or company, that can help perform repeated tasks such as document review, onboarding processes, or approval requests.

Presenting Poe

At Forefront, we like to scratch our own itches. After consultation with everyone on the team, and some exploration of the available features of the Slack API, we went about building a Slack integration that would assist us with routine tasks and make day-to-day life easier.

Thus, our digital concierge Poe was born. An ever-helpful fellow, Poe hangs out in our Slack channels and direct messages, ready to help us with various tasks including:

Shared URL Saving

Our team is constantly sharing links to articles and blog posts describing tools, techniques, use-cases, and strategy. While chat tools are a great medium for sharing links quickly and easily, potentially valuable content is notoriously difficult to find again amongst a constant stream of communication.

Poe listens for incoming URLs in messages, and for every one that he finds, he dispatches it to a database along with the channel it was shared in, the name of the person who shared the link, and some keywords from the text of the shared article to help with discovery. Poe then makes this information available through a simple web app, which allows us to retrieve that specific article we meant to read a couple of months ago, and also provides us with a valuable knowledge library when researching a topic.
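The link-capturing step could be sketched roughly as follows. This is a minimal illustration rather than Poe's actual code: it assumes the message arrives as a Slack event dictionary with `text`, `channel` and `user` keys, relies on Slack's convention of wrapping links in angle brackets, and uses a deliberately naive word-frequency keyword picker. The names `extract_urls`, `top_keywords`, `build_records` and `article_text_for` are hypothetical.

```python
import re
from collections import Counter

# Slack wraps links in angle brackets, optionally followed by "|label".
SLACK_URL = re.compile(r"<(https?://[^|>\s]+)(?:\|[^>]*)?>")

# A tiny, hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"that", "with", "this", "from", "have", "their", "which", "about"}


def extract_urls(message_text):
    """Return the URLs found in a Slack message, in order, without duplicates."""
    urls = []
    for url in SLACK_URL.findall(message_text):
        if url not in urls:
            urls.append(url)
    return urls


def top_keywords(article_text, limit=5):
    """Naive keyword picker: the most frequent non-stopword words of 4+ letters."""
    words = re.findall(r"[a-z]{4,}", article_text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return [word for word, _ in counts.most_common(limit)]


def build_records(event, article_text_for):
    """Turn one Slack message event into rows ready to be stored.

    `article_text_for` is a hypothetical callable that fetches the article
    body for a URL; it is injected so this sketch stays offline.
    """
    return [
        {
            "url": url,
            "channel": event["channel"],
            "shared_by": event["user"],
            "keywords": top_keywords(article_text_for(url)),
        }
        for url in extract_urls(event.get("text", ""))
    ]
```

Each record could then be written to whatever database backs the web app. Note that a Slack event carries a user ID rather than a display name, so a lookup step would be needed to store the sharer's name.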

AWS EC2 Instance Management

A common workflow we adopt in our projects is to spin up an AWS EC2 instance for those on the project to do their work in. These instances are often fairly well-provisioned in order to support heavy workloads being shared across multiple users, so it’s desirable to switch them off when not in use. In practice, however, this doesn’t always happen.

To overcome this, Poe checks across the various AWS accounts we use at the end of the workday. If there are any instances running that are not tagged as being OK to do so, Poe will post an alert in the relevant Slack channel, reminding us that we might want to consider turning them off. We also gave Poe some interactive features (implemented as Slack’s slash commands) allowing us to query the list of EC2 instances that are either running or stopped, as well as giving us the ability to shut down instances directly from within Slack.
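As an illustration of the end-of-day check, here is a small sketch that filters the response of Boto3's `describe_instances` call for running instances lacking an exemption tag. The tag key `keep-running` and the function names are assumptions made for this example, not details of Poe's implementation. The Boto3 import is deferred so that the filtering logic can be exercised without AWS credentials.

```python
def untagged_running_instances(reservations, ok_tag="keep-running"):
    """Return IDs of running instances that lack the 'OK to stay on' tag.

    `reservations` has the shape of the "Reservations" list returned by
    EC2's describe_instances. The tag key is a hypothetical choice.
    """
    flagged = []
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            if instance.get("State", {}).get("Name") != "running":
                continue
            tag_keys = {tag["Key"] for tag in instance.get("Tags", [])}
            if ok_tag not in tag_keys:
                flagged.append(instance["InstanceId"])
    return flagged


def fetch_reservations(region_name):
    """Fetch every reservation in a region (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the filter above can be tested offline

    ec2 = boto3.client("ec2", region_name=region_name)
    reservations = []
    for page in ec2.get_paginator("describe_instances").paginate():
        reservations.extend(page["Reservations"])
    return reservations


def format_alert(instance_ids):
    """Build the reminder text, or return None when nothing needs flagging."""
    if not instance_ids:
        return None
    listing = ", ".join(instance_ids)
    return f"Still running without a keep-running tag: {listing}. Consider turning them off."
```

Shutting an instance down from a slash command would then come down to a call such as `ec2.stop_instances(InstanceIds=[...])` on the same client.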

Timesheet Reminders

A day or so before timesheets are due, someone will usually jump on the Slack ‘General’ channel and remind everyone to submit them. Over time we have noticed that this reminder does improve timely timesheet submission. However, the task isn’t explicitly assigned to anyone, so it doesn’t always happen. We’ve now outsourced this job to Poe, who is pretty good at not forgetting things.

Document Workflow Demo

The last capability we gave Poe is more of a demonstration of how custom software agents can be partnered with cloud-based AI services to assist with custom workflows. Sometimes referred to as intelligent automation or augmented intelligence, this type of application – in which AI is used to enhance the productivity of workers performing cognitively menial tasks – allows people to spend more time focusing on challenges where their expertise can be put to better use.

This feature enables us to upload a document to Poe directly in Slack in a range of formats, such as DOCX, PDF, or even a scanned JPG or PNG. Poe consumes the document and stores it safely in a collection which has search capabilities, while also processing the document’s contents to extract information such as contact and organisation names and locations. This means that in addition to being able to search the text of documents, we can also perform queries like “show me all documents submitted between 1/8/2018 and 1/8/2019 that include the name Kim Sanders.”
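To make the entity step a little more concrete, the sketch below groups the entities from an AWS Comprehend `detect_entities` response by type and then answers a query like the one above. The confidence threshold, the document dictionary layout and the function names are assumptions for illustration, and the dates in the usage example read 1/8 as 1 August.

```python
from collections import defaultdict
from datetime import date


def group_entities(comprehend_response, min_score=0.8):
    """Group entity texts by type, keeping only confident detections.

    The 0.8 threshold is an arbitrary, hypothetical cut-off.
    """
    grouped = defaultdict(set)
    for entity in comprehend_response.get("Entities", []):
        if entity["Score"] >= min_score:
            grouped[entity["Type"]].add(entity["Text"])
    return {etype: sorted(texts) for etype, texts in grouped.items()}


def detect_entities(text):
    """Call Comprehend for English text (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the rest of the sketch runs offline

    comprehend = boto3.client("comprehend")
    return comprehend.detect_entities(Text=text, LanguageCode="en")


def mentions_person_between(doc, name, start, end):
    """True if the document was submitted in [start, end] and names `name`."""
    in_range = start <= doc["submitted"] <= end
    return in_range and name in doc["entities"].get("PERSON", [])


# Usage with a hand-written response in Comprehend's documented shape.
sample_response = {
    "Entities": [
        {"Type": "PERSON", "Text": "Kim Sanders", "Score": 0.99},
        {"Type": "ORGANIZATION", "Text": "Forefront", "Score": 0.95},
        {"Type": "LOCATION", "Text": "Springfield", "Score": 0.40},
    ]
}
sample_doc = {"submitted": date(2018, 11, 5), "entities": group_entities(sample_response)}
found = mentions_person_between(sample_doc, "Kim Sanders", date(2018, 8, 1), date(2019, 8, 1))
```

In a real deployment the grouped entities would be stored alongside the document metadata so that such queries can be run in the database rather than in Python.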

While we targeted a somewhat generic and hypothetical document workflow for our demonstration, in practice this kind of application would be customised to target a concrete workflow within a specific organisation once a solid business case has been established.

Poe’s Architecture

While Poe hangs out with us in Slack, his real home is in the cloud, where he is composed of a range of AWS services.

  • Poe’s core component—the part that communicates with the Slack JSON API—is a Python Flask app running on AWS Lambda and AWS API Gateway, with his EC2 manipulation skills being provided by the Boto3 library.
  • The amazing Zappa library makes the process of taking an existing Flask app and giving it a serverless deployment using these services an absolute breeze.
  • Poe also uses AWS S3 for storing uploaded documents, and AWS Aurora (a serverless SQL database) for storing document metadata and shared URL data.
  • Poe’s ability to ingest a range of document formats is provided by AWS Textract, and his ability to extract names and entities from the documents is provided by AWS Comprehend.
  • Finally, AWS Cognito allows us to make sure that only authorised users are able to log into Poe’s web interface.
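As a rough sketch of how the core component might route a slash command, the function below takes the form fields Slack posts (`command` and `text`) and returns a reply in Slack's JSON response format. The `/ec2` command name, its arguments and the injected `list_instances` callable are hypothetical, and the function is kept free of Flask so it can be run on its own.

```python
USAGE = {"response_type": "ephemeral", "text": "Usage: /ec2 list running|stopped"}


def handle_slash_command(form, list_instances):
    """Route a Slack slash-command payload to a reply dictionary.

    `form` holds the form-encoded fields Slack posts for a slash command.
    `list_instances(state)` is a hypothetical callable returning instance IDs.
    """
    args = form.get("text", "").split()
    valid = (
        form.get("command") == "/ec2"
        and len(args) == 2
        and args[0] == "list"
        and args[1] in ("running", "stopped")
    )
    if not valid:
        return USAGE

    state = args[1]
    ids = list_instances(state)
    listing = "\n".join(ids) if ids else "(none)"
    # "ephemeral" replies are shown only to the user who ran the command.
    return {"response_type": "ephemeral", "text": f"Instances currently {state}:\n{listing}"}
```

Inside a Flask view, this would typically be called with `request.form` and its result passed to `jsonify`, with Zappa handling the deployment to Lambda and API Gateway.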

We opted for serverless technologies throughout, which makes Poe extremely cheap to run: we only pay for what we use, which is not very much for the kind of workload that Poe services. Additionally, we were able to take advantage of the diverse range of AWS services on offer, selecting just the right ones and assembling them like LEGO. AWS Textract and AWS Comprehend were particularly handy accelerators, empowering Poe with AI-based document processing that would otherwise have taken a considerable amount of development time to integrate with other third-party products.

Untangling the Buzzwords

Everyone seems to be talking about bots and chatbots these days – but what exactly are they, and when do you need them? Let’s use Poe as an example.

We think of Poe as a software agent or bot. He has some degree of autonomy, being able to act without explicit user intervention. He also has a range of capabilities which vary depending on the context he finds himself in, and he will respond to user requests. His ability to perform more complex tasks, such as identifying the names of people and organisations within documents, is starting to land him in the space of an intelligent agent, although he probably requires a few more smarts for this label.

Poe’s EC2 management skills also place him in the realm of the ChatOps paradigm, which involves the use of chat clients or chatbots to provide monitoring and system orchestration facilities for teams within their instant messaging platform. However, we wouldn’t call Poe a chatbot currently as he doesn’t come equipped with a conversational interface. A chatbot is simply a piece of software that a user interacts with through natural language. Much of the hype around this technology is due to people thinking about chatbots as AI that can be used to fully automate large portions of customer/user interaction. This has led to a pronounced mismatch between expectation and reality. A better mental model for this technology is to think of a conversational agent helping to elicit information from a user in order to fill out a form or navigate an information pathway [1].

While a conversational interface may not sound as impressive, this type of application was not readily accessible to developers until recently. Services like AWS Lex, Azure Bot Service, and Dialogflow now provide toolkits that include intent classification models, which automatically match natural language user input to target responses. These models are based on advances in natural language processing that previously required machine learning specialists to train and deploy. Poe does not currently feature any dialog flows, but this facility would make a natural extension to his capabilities. Conversational interfaces can be particularly useful in building out custom workflow interactions, such as eliciting the information needed for a document submission.
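The "fill out a form" mental model can be illustrated with a toy sketch in which the agent simply asks for whichever field is still missing. This covers only the form-filling loop and leaves out the intent classification that the hosted services provide; the field names and questions are invented for the example.

```python
# Hypothetical fields for a document submission, with the question for each.
REQUIRED_FIELDS = {
    "title": "What is the document's title?",
    "author": "Who wrote it?",
    "project": "Which project does it belong to?",
}


def next_prompt(form):
    """Return (field, question) for the first unanswered field, else None."""
    for field, question in REQUIRED_FIELDS.items():
        if not form.get(field):
            return field, question
    return None


def run_dialog(replies):
    """Feed user replies into the form, in order, until it is complete."""
    form = {}
    replies = iter(replies)
    step = next_prompt(form)
    while step is not None:
        field, _question = step  # a real bot would send the question to the user
        form[field] = next(replies)
        step = next_prompt(form)
    return form
```

A hosted service would sit in front of a loop like this, mapping free-form replies onto the fields instead of assuming the user answers each question in turn.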

Ultimately, it doesn’t really matter what term you use to describe these technologies. When evaluating their suitability for your organisation, it’s important to know what they can and can’t do, and to really understand your business processes, workflows and current friction points within them. If you’re at that point and you have a strong instant message-based communication culture, you could be in a position to consider making your own digital concierge.

[1] https://medium.com/swlh/a-natural-language-user-interface-is-just-a-user-interface-4a6d898e9721