Book Bot
This project lets users train their own chatbot on a PDF file, so they can have a conversation with the book. Text generation is based on a Markov chain.




Debugging
01 Text Generation Function

02 Extract Text From a PDF File



- Failed to get a JSON file automatically
Fixed by: calling the main function at the end of the file:
main()
- Failed to get rid of the whitespace
At this step, I tried to extract text from a PDF to generate a dataset.
First, I tried borb, but it removed all the whitespace, so I switched to PyPDF2. However, I still can't get rid of the punctuation; that is left for a future iteration.
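The two fixes above can be sketched roughly as follows. This is a minimal sketch, not the project's actual script: the function names, the file names in the usage message, and the choice to dump a plain word list as JSON are all assumptions made for illustration. It also includes one possible way to strip punctuation, which the notes above leave for a future iteration.

```python
import json
import re
import string
import sys


def extract_text(pdf_path):
    """Return the text of every page in the PDF, joined by newlines."""
    # PyPDF2 is third-party (pip install PyPDF2). It is imported here so
    # the cleanup helper below stays usable without it.
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_path)
    # extract_text() may yield nothing for image-only pages, hence `or ""`.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def clean_text(raw):
    """Delete ASCII punctuation and collapse whitespace runs to one space."""
    without_punct = raw.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", without_punct).strip()


def main(pdf_path, json_path):
    """Extract, clean, and save the word list as JSON (assumed layout)."""
    words = clean_text(extract_text(pdf_path)).split()
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(words, f)


# Defining main() is not enough: without a call like this one at the end
# of the file, running the script never writes the JSON file.
if __name__ == "__main__":
    if len(sys.argv) == 3 and sys.argv[1].lower().endswith(".pdf"):
        main(sys.argv[1], sys.argv[2])
    else:
        print("usage: python extract.py book.pdf words.json")
```

Note that `string.punctuation` only covers ASCII characters, so curly quotes and similar symbols from a PDF would survive this cleanup and would need to be handled separately.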
Then I imported my generate.py file into a bot.py file to run it on Discord.
I changed a few parts of the code:
- Added a new function called mainbot that directly loads the dictionary.json file (the dataset generated by learn.py, which splits sentences into words and records how many times they show up).
- Duplicated bot.py from ChatbotGPT, removed OpenAI, and made the bot respond by calling the imported mainbot() function, with the number of words I want in the response passed in the parentheses.
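To make the dictionary.json idea concrete, here is a minimal sketch of how a learning step and a mainbot-style generator could fit together. It assumes a first-order Markov chain whose table maps each word to counts of the words that followed it; the real learn.py and mainbot may store and use the data differently. The names `learn` and `save_table`, and the `path` and `seed` parameters, are hypothetical.

```python
import json
import random


def learn(text):
    """Map each word to a dict of {following word: times it followed}."""
    words = text.split()
    table = {}
    for current, following in zip(words, words[1:]):
        followers = table.setdefault(current, {})
        followers[following] = followers.get(following, 0) + 1
    return table


def save_table(table, path="dictionary.json"):
    """Write the transition table to disk as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(table, f)


def mainbot(n_words, path="dictionary.json", seed=None):
    """Load the saved table and generate n_words words from it."""
    with open(path, encoding="utf-8") as f:
        table = json.load(f)
    if n_words <= 0 or not table:
        return ""
    rng = random.Random(seed)
    word = rng.choice(list(table))
    output = [word]
    while len(output) < n_words:
        followers = table.get(word)
        if followers:
            # Pick the next word in proportion to how often it followed.
            word = rng.choices(
                list(followers), weights=list(followers.values())
            )[0]
        else:
            # Dead end (the word only appeared last): restart anywhere.
            word = rng.choice(list(table))
        output.append(word)
    return " ".join(output)
```

With a table saved by `save_table(learn(text))`, a call such as `mainbot(20)` returns a 20-word string. Because each step looks only at the current word, the output tends to be locally plausible but globally rambling, which is typical of a first-order chain.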
Finally, I @-mention the bot on Discord to trigger it, and it generates sentences based on the PDF I extracted as the dataset. Lowkey talking to the author of that article.
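The mention-triggered reply could be wired up along these lines. This is only a sketch: it assumes the bot uses the discord.py library (version 2.x), which these notes do not confirm, and `should_reply`, `build_client`, and `generate_reply` are hypothetical names.

```python
def should_reply(author_id, bot_id, mentioned_ids):
    """Reply only when someone other than the bot mentions the bot."""
    return author_id != bot_id and bot_id in mentioned_ids


def build_client(generate_reply):
    """Return a Discord client that answers mentions with generated text."""
    # discord.py is third-party (pip install discord.py); imported here so
    # should_reply() can be used and tested without it.
    import discord

    client = discord.Client(intents=discord.Intents.default())

    @client.event
    async def on_message(message):
        mentioned = [user.id for user in message.mentions]
        if should_reply(message.author.id, client.user.id, mentioned):
            reply = generate_reply()
            if reply:  # Discord rejects empty messages
                await message.channel.send(reply)

    return client
```

Under these assumptions, bot.py would end with something like `build_client(lambda: mainbot(20)).run(token)`, where `mainbot` is the function imported from generate.py and `token` is the bot token from the Discord developer portal.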