End-to-End Approach for making a Facebook bot using a Sequence to Sequence model.
If you arrived here from my blog, here is the link for the code
As a fun school project with two folks, we decided to work on Sequence to Sequence models (see this blog), and we spent some time building a chatbot that answers in our place on Facebook. To train it, we used our own Facebook Messenger history (technically, some messages were Facebook Chat messages; I’m getting too old…).
We trained the model at the character level because, on Messenger, syntax and grammar usually suck. But if you have time, you could use this software. As our conversations were not in English, we didn’t use it, but it might be worth trying and comparing.
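To make the character-level idea concrete, here is a minimal sketch of how messages could be turned into integer sequences. It is not taken from the project code: the function names and the `<pad>` / `<unk>` tokens are assumptions chosen for illustration.

```python
# Illustrative character-level encoding (hypothetical helper names,
# not the project's actual preprocessing).
PAD, UNK = "<pad>", "<unk>"  # assumed special tokens


def build_vocab(messages):
    """Map every character seen in the corpus to an integer id."""
    chars = sorted(set("".join(messages)))
    return {token: i for i, token in enumerate([PAD, UNK] + chars)}


def encode(text, vocab):
    """Replace each character by its id; unseen characters map to UNK."""
    return [vocab.get(ch, vocab[UNK]) for ch in text]


def decode(ids, vocab):
    """Inverse of encode for ids that exist in the vocabulary."""
    inverse = {i: token for token, i in vocab.items()}
    return "".join(inverse[i] for i in ids)


messages = ["slt ca va ?", "oui et toi"]
vocab = build_vocab(messages)
ids = encode("ca va", vocab)
```

Because the vocabulary is just the set of characters, typos and slang never fall outside it, which is the main appeal for chat data.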
For this project, you’ll need a Python interpreter (make sure you have installed all dependencies with pip install -r requirements.txt) and a good GPU.
Without any further ado, let’s go :)
1. Retrieve and parse Messenger
To collect your data, go to your Facebook Settings, click on Download a copy, and download the archive when it is ready.
Then extract messages.htm from the archive and convert it to text format with fbchat-archive-parser. Here is the command:
fbcap messages.htm -r -f json > file.json
Connecting to Internet to retrieve user name
The -r flag is not mandatory, but without it some conversations will not be grouped together. Conversations are grouped by the name of the writer, and the same writer sometimes appears under a name and sometimes under an ID, which the parser treats as different people. With -r, IDs are replaced, where possible, with the sender name.
To parse your conversations nicely, you’ll need parse_messenger_chat.py. In the file, replace main_user with your name. You can restrict your bot to be trained only on your own answers with restrict_answer_to_main_user=True, or train it on all answers from you and your friends.
Finally, run python parse_messenger_chat.py
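The contents of parse_messenger_chat.py are not shown here, so the following is only a sketch of how main_user and restrict_answer_to_main_user could shape the training pairs. The `build_pairs` function and the `(sender, text)` input format are assumptions for illustration.

```python
# Hypothetical sketch: build (prompt, answer) pairs from one conversation.
# The input format and function name are assumptions, not the script's API.
def build_pairs(thread, main_user, restrict_answer_to_main_user=True):
    """thread is a chronological list of (sender, text) tuples."""
    # Merge consecutive messages from the same sender into a single turn.
    turns = []
    for sender, text in thread:
        if turns and turns[-1][0] == sender:
            turns[-1] = (sender, turns[-1][1] + " " + text)
        else:
            turns.append((sender, text))

    # Each turn becomes the answer to the turn right before it.
    pairs = []
    for (_, prev_text), (sender, text) in zip(turns, turns[1:]):
        if restrict_answer_to_main_user and sender != main_user:
            continue  # keep only answers written by the main user
        pairs.append((prev_text, text))
    return pairs


thread = [
    ("Alice", "hey"),
    ("Alice", "you there?"),
    ("Bob", "yes"),
    ("Alice", "cool"),
]
only_bob = build_pairs(thread, main_user="Bob")
everyone = build_pairs(thread, main_user="Bob", restrict_answer_to_main_user=False)
```

With the restriction on, the bot only learns to answer like the main user; with it off, it learns from every reply in the history.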
2. Train a model
To train a model, use the script main.py.
Inside the code, you can change some parameters of the model using TensorFlow flags:
- Number of epochs: nb_epochs
- Number of layers: num_layers
- Number of hidden neurons in cells: num_hidden
- Size of the embeddings for characters: embedding_size
- Using attention: use_attention (note: it does not make a lot of sense on character-level models)
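As a rough illustration of what such a set of flags looks like, here is a sketch that uses Python's standard argparse as a stand-in for TensorFlow flags, so that it runs without TensorFlow. The flag names come from the list above; the default values and the `parse_flags` helper are assumptions, not the values used in main.py.

```python
import argparse


def parse_flags(argv=None):
    """Stand-in for the TensorFlow flags; all defaults are illustrative guesses."""
    parser = argparse.ArgumentParser(description="Seq2seq hyperparameters (sketch)")
    parser.add_argument("--nb_epochs", type=int, default=10,
                        help="number of passes over the training data")
    parser.add_argument("--num_layers", type=int, default=2,
                        help="number of stacked recurrent layers")
    parser.add_argument("--num_hidden", type=int, default=256,
                        help="number of hidden units per cell")
    parser.add_argument("--embedding_size", type=int, default=64,
                        help="size of the character embeddings")
    parser.add_argument("--use_attention", action="store_true",
                        help="enable attention in the decoder")
    return parser.parse_args(argv)


# Override a single hyperparameter and keep the other defaults.
flags = parse_flags(["--num_layers", "3"])
```

Keeping every hyperparameter in one place like this makes it easy to record which configuration produced which trained model.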
Start the training with python main.py.
Go jogging, get a good night’s sleep, and come back in a few days.
Computation cost
As suggested above, you should have a good GPU and a lot of RAM on your computer. Otherwise, you may not be able to train the model.
3. Run a bot online
There are two ways to connect your “Artificial General Intelligence” to the Internet: through a Page (see this tutorial) or through your profile.
Connect through your Profile
After training, your model should be saved in a new folder whose name uniquely identifies the architecture you used.
You need to find the latest trained model and locate its last meta graph. Then, in bot.py, change the logdir and model_name variables.
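Locating the last meta graph by hand can be tedious, so here is a small sketch that picks the most recently modified .meta file in a folder. The `latest_meta_graph` helper is hypothetical, and the exact form that bot.py expects for model_name (with or without the .meta extension) is an assumption to check against the script.

```python
import glob
import os


def latest_meta_graph(logdir):
    """Return (logdir, model_name) for the newest *.meta file, or None."""
    metas = glob.glob(os.path.join(logdir, "*.meta"))
    if not metas:
        return None
    # Pick the file that was written last.
    newest = max(metas, key=os.path.getmtime)
    # Assumption: model_name is the file name without the .meta extension.
    model_name = os.path.splitext(os.path.basename(newest))[0]
    return logdir, model_name
```

The returned pair can then be copied into the logdir and model_name variables.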
For connecting to Messenger, I used fbchat.
Finally, run the script python bot.py
Your bot is now answering all messages sent to you. Messages are answered sequentially, in the order they are received. There is always a short delay :).
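The sequential behaviour can be pictured as a first-in, first-out queue. The sketch below does not use fbchat and is not the code of bot.py: `generate_reply` is a placeholder for the trained model, and `send` stands for whatever function delivers the reply.

```python
import queue


def generate_reply(text):
    # Placeholder for the trained model's inference step (hypothetical).
    return "echo: " + text


def answer_all(inbox, send):
    """Answer every queued message, one at a time, in arrival order."""
    while True:
        try:
            sender, text = inbox.get_nowait()
        except queue.Empty:
            break  # nothing left to answer
        send(sender, generate_reply(text))


inbox = queue.Queue()
inbox.put(("alice", "hi"))
inbox.put(("bob", "yo"))

sent = []
answer_all(inbox, lambda to, msg: sent.append((to, msg)))
```

Since each reply is generated before the next message is read, a slow model delays every message queued behind it, which is one plausible source of the short delay.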
Have fun.
Acknowledgments
This project was done with