Projects
1. Stanford NLP Lecture Transcription using OpenAI’s Whisper
Whisper is an automatic speech recognition (ASR) model trained on 680,000 hours of multilingual and multitask supervised data. It is implemented as an encoder-decoder Transformer: audio is split into 30-second chunks, converted into a log-Mel spectrogram, and passed to the encoder. The decoder is trained to predict the corresponding text caption, intermixed with special tokens that direct the single model to perform tasks such as language identification, phrase-level timestamps, multilingual speech transcription, and to-English speech translation. For more info about Whisper, read here.
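As a rough illustration, a transcription run with the open-source `openai-whisper` package looks something like the sketch below; the model size and file name are placeholders, not necessarily what this project used:

```python
# Minimal sketch using the open-source openai-whisper package
# (pip install openai-whisper). "base" and the file name are placeholders.
import whisper

model = whisper.load_model("base")          # download/load a pretrained checkpoint
result = model.transcribe("lecture01.mp3")  # chunking and decoding handled internally
print(result["text"])                       # full transcript as a single string
```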
I used the Whisper model to transcribe Stanford NLP lectures into text captions. Here is the result of the transcribed lectures. The web app is built with Flask and deployed on an AWS EC2 instance. You can find the transcribed audio files in the form of text here.
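For context, serving the transcripts from Flask can be sketched as follows; the route and the `transcripts/` directory are illustrative assumptions, not the app's actual code:

```python
# Hypothetical sketch of a Flask app serving transcript files;
# the "transcripts/" directory and route name are assumptions.
from flask import Flask, send_from_directory

app = Flask(__name__)

@app.route("/transcripts/<path:name>")
def transcript(name):
    # e.g. GET /transcripts/lecture01.txt returns the transcribed text
    return send_from_directory("transcripts", name)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```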
2. Custom Named Entity Recognizer for clinical data
Named Entity Recognition (NER) is a subtask of Natural Language Processing (NLP) that involves identifying and categorizing named entities in text.
I have developed a custom named entity recognition (NER) model for clinical data using the spaCy framework and deployed it using Streamlit. The model identifies entities such as diseases, treatments, medications, and anatomical locations in clinical text, classifying them into three classes: MEDICINE, MEDICALCONDITION, and PATHOGEN. The dataset is from Kaggle. You can try the application at this link.
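To illustrate, running such a trained spaCy pipeline over clinical text looks roughly like this; the model path and the example sentence are hypothetical:

```python
# Hedged sketch of inference with a trained spaCy NER pipeline;
# "clinical_ner_model" is a hypothetical path to the saved model.
import spacy

nlp = spacy.load("clinical_ner_model")
doc = nlp("The patient was prescribed amoxicillin for a streptococcal infection.")
for ent in doc.ents:
    print(ent.text, ent.label_)  # labels: MEDICINE, MEDICALCONDITION, PATHOGEN
```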
3. Question Answering using LangChain and OpenAI
This application provides a simple example of how to build a question-answering system using LangChain, pre-trained language models from OpenAI, and Streamlit.
LangChain helps build applications with Large Language Models (LLMs) through composability: it combines LLMs with other sources of computation and data.
I developed a question-answering system using LangChain with OpenAI embeddings. Since LLMs have a fixed context length, LangChain works around this limitation with chains: the document is broken into chunks, and the chain is run over the whole document. In this application, when a user uploads a file, its contents are converted into embeddings using OpenAI embeddings and stored in a Pinecone vector database, which enables fast retrieval. When the user enters a query, a similarity search retrieves the most similar embeddings from the vector store, and the LangChain chain passes the retrieved context, formatted together with the question, to the LLM.
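The flow above can be sketched with the classic LangChain API; module paths have changed across LangChain versions, and the index name, chunk sizes, and document text below are placeholders rather than this app's exact code:

```python
# Hedged sketch of the upload -> embed -> retrieve -> answer flow.
# Index name, environment variables, and document text are placeholders.
import os
import pinecone
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Pinecone
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain

pinecone.init(api_key=os.environ["PINECONE_API_KEY"],
              environment=os.environ["PINECONE_ENV"])

# 1. Break the uploaded document into chunks that fit the context window.
document_text = open("uploaded.txt").read()  # stand-in for the uploaded file
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_text(document_text)

# 2. Embed the chunks with OpenAI embeddings and store them in Pinecone.
store = Pinecone.from_texts(chunks, OpenAIEmbeddings(), index_name="qa-demo")

# 3. Retrieve similar chunks for a query and pass them to the LLM.
query = "What is discussed in the document?"
docs = store.similarity_search(query)
chain = load_qa_chain(OpenAI(), chain_type="stuff")
print(chain.run(input_documents=docs, question=query))
```

The "stuff" chain type simply concatenates the retrieved chunks into a single prompt, which matches the retrieve-then-answer flow described above.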