A Zero-shot ReAct agent that utilizes toolkits designed for data acquisition, manipulation, and processing. This environment contains everything you need to source, create, and deploy your machine learning models and chat with them using your configured agent.
miniAGI is a Streamlit application that provides a chat interface with AGI (Artificial General Intelligence) capabilities. It uses a set of toolkits to answer user queries, with decision-making based on the plan-and-execute model from the LangChain framework.
Plugins are available at https://github.com/tdolan21/miniAGI-plugins
I recognize the immediate criticism that AGI applications can be heavy token consumers, and this one can be too, depending on your intended use. However, the agent in this application is a zero-shot ReAct agent that chooses which tool to use based solely on each tool's description. This gives the user more control over the kinds of keywords or phrases that trigger a tool.
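To make the role of tool descriptions concrete, here is a minimal sketch of how a zero-shot ReAct prompt can be assembled from tool names and descriptions. It is not miniAGI's or LangChain's actual code: the `Tool` dataclass, `build_react_prompt`, and the two example tools are hypothetical, and the prompt wording is only an approximation of the ReAct format. The point it illustrates is that the model only sees the description text, so that text decides when a tool is picked.

```python
from dataclasses import dataclass


@dataclass
class Tool:
    """Hypothetical stand-in for an agent tool: a name and a description."""
    name: str
    description: str


def build_react_prompt(tools: list[Tool], question: str) -> str:
    """Render the tool list into a ReAct-style prompt.

    The language model never sees the tool's code, only the lines built
    here, so the description is what steers tool selection.
    """
    tool_lines = "\n".join(f"{t.name}: {t.description}" for t in tools)
    tool_names = ", ".join(t.name for t in tools)
    return (
        "Answer the question using the tools below.\n\n"
        f"{tool_lines}\n\n"
        "Use this format:\n"
        "Thought: reason about what to do next\n"
        f"Action: one of [{tool_names}]\n"
        "Action Input: the input to the action\n"
        "Observation: the result of the action\n\n"
        f"Question: {question}\n"
        "Thought:"
    )


# Illustrative tools only; these are not the names shipped with miniAGI.
tools = [
    Tool("dataset_search", "Use when the user asks to find or load a public dataset."),
    Tool("document_qa", "Use when the user asks about files in the documents folder."),
]

prompt = build_react_prompt(tools, "Which dataset contains handwritten digits?")
print(prompt)
```

Narrowing or rewording a description changes which questions the model associates with that tool, without touching the tool itself.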
The toolkit included in miniAGI is geared towards a machine learning environment where you can acquire data, manipulate it, store it, and recall it using your preferred AI model. The tools are pre-configured for demos that help you learn how to use machine learning models, and they can be easily reconfigured for use with your own machine learning tools. The Deep Lake toolkit is especially useful here, as Deep Lake offers over 250 datasets that can be used with the plan-and-execute agent. This gives you a head start on your project at any level. You can then query the dataset through vector search, including images if you choose.
Once your machine learning model is complete and you have a rich dataset, you can deploy the model to Banana (cloud) or a Potassium server (local) and chat with it using the plan-and-execute agent.
These tools are separate and are not all used in the same context. This application is intended to represent a fully functional data acquisition platform that supports the end-to-end production of machine learning models.
Considering this, all functionality of this application is experimental and should be treated as such. I will be including a Docker image soon if you do not wish to download all of the different required programs locally. They are all required to run the application, and each external service additionally needs its own API key. This is another reason why it's better to include toolsets only where they are needed rather than give the agent too much freedom.
A more targeted goal and toolset gives a more predictable experience, where you can focus on experimentation with the plan-and-execute agent at whatever level you wish, rather than having it eat all your tokens and run up costs. The agents will get there, but having more realistic goals and use cases is very important along the way.
This agent does not have the freedom to use shell commands or freely access files on your machine. The only files it has access to locally are the files you put in the documents folder and the subdirectories required for each tool.
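As an illustration of what confining file access to one folder can look like, here is a small sketch that checks whether a requested path stays inside a base directory. This is an assumption about how such a restriction could be checked, not a description of how miniAGI enforces it; the helper name `is_inside` is hypothetical, and the `documents` folder name is taken from the text above.

```python
from pathlib import Path


def is_inside(base_dir: Path, candidate: Path) -> bool:
    """Return True if candidate, taken relative to base_dir, stays within base_dir.

    Both paths are resolved first so that `..` segments and absolute
    paths cannot escape the base directory.
    """
    base = base_dir.resolve()
    target = (base / candidate).resolve()
    return target == base or base in target.parents


documents = Path("documents")
print(is_inside(documents, Path("notes/readme.txt")))  # True
print(is_inside(documents, Path("../.env")))           # False
```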
This guide will serve as a ‘quick install’ for Linux machines:
git clone https://github.com/tdolan21/miniAGI.git
cd miniAGI
mv .env.example .env
Once you have created your own .env file, you need to fill out the environment variables and run:
bash miniAGI.sh
Windows machines MUST use Docker.
The initial process is the same: clone the repo, enter the directory, and create the .env file.
git clone https://github.com/tdolan21/miniAGI.git
cd miniAGI
mv .env.example .env
After this is completed, you will need to add your actual API credentials to the .env file that has been created.
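Before building, it can help to confirm that no variable in the .env file was left blank. The following is a minimal sketch of such a check; it is not part of miniAGI, the function name `find_empty_env_keys` is hypothetical, and the variable names in the sample are placeholders rather than the real keys from `.env.example`.

```python
from pathlib import Path


def find_empty_env_keys(env_text: str) -> list[str]:
    """Return the names of variables whose value is empty in .env-style text."""
    empty = []
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Treat a value made only of whitespace or quotes as empty.
        if not value.strip().strip("'\""):
            empty.append(key.strip())
    return empty


# Placeholder variable names, used only to demonstrate the check.
sample = """
# API credentials
EXAMPLE_API_KEY=
EXAMPLE_DB_PASSWORD="secret"
EXAMPLE_OTHER_KEY=''
"""
print(find_empty_env_keys(sample))  # ['EXAMPLE_API_KEY', 'EXAMPLE_OTHER_KEY']

# Run the same check against the real file if it is present.
env_file = Path(".env")
if env_file.exists():
    print("Unfilled variables:", find_empty_env_keys(env_file.read_text()))
```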
#### Building the image
This will build the image for the first time. It may take a few minutes. You have time to get more coffee.
docker build -t miniagi .
#### Using Docker Compose
docker-compose up # To start the services
docker-compose down -v # To stop the services and remove their volumes
This application does NOT work locally on Windows.
More information can be found at https://github.com/pgvector/pgvector
This section outlines what the miniAGI.sh script is doing.
The recommended way to run this application is in a conda virtual environment.
If you are unfamiliar with conda, here are the install instructions for Linux
conda create --name miniAGI python=3.9
conda activate miniAGI
If you want to use an NVIDIA GPU as compute for running machine learning models in the various modules, initialize the conda env like this before using pip install.
Ensure that the environment is activated before continuing this portion.
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
If you just intend on using CPU compute, then a simple pip install will be enough once the conda environment has been created.
pip install -r "requirements.txt"
streamlit run app.py
You will get this error:
miniagi-db-1 | 2023-09-11 17:19:38.073 UTC [68] FATAL: password authentication failed for user "postgres"
miniagi-db-1 | 2023-09-11 17:19:38.073 UTC [68] DETAIL: Connection matched pg_hba.conf line 100: "host all all all scram-sha-256"
miniagi-db-1 | 2023-09-11 17:19:38.090 UTC [69] FATAL: password authentication failed for user "postgres"
miniagi-db-1 | 2023-09-11 17:19:38.090 UTC [69] DETAIL: Connection matched pg_hba.conf line 100: "host all all all scram-sha-256"
This error is a bug in how the pgvector Docker image interacts with standard PostgreSQL functionality.
However, even with the error present, the database connection can be verified through the different features, and everything still works as expected.
The database should not throw this error, as the connection is made via either a SCRAM-SHA-256 password or trust on local networks.
If you wish to change this or attempt to fix the bug, the file you will need is pg_hba.conf, located at:
/var/lib/postgresql/data
This application requires installation of Python and pip. Once you have Python and pip installed, you should follow the steps outlined in the Quick Install. This will leave you with a Python virtual environment that is configured for this application and does not affect your personal machine. The requirements for this project are rather extensive, but they provide a richer experience than going without these tools. I consider this trade-off to be worth it. You may not, and that's okay too.
First, you need to install PostgreSQL for your platform. Here is the download link for Windows and Linux.
It's recommended to also install pgAdmin during the PostgreSQL 15 install so you can easily verify that each feature is working.
Once you have PostgreSQL and pgAdmin installed, open pgAdmin and follow the install instructions listed in the pgvector link. This extension is not formally supported, but it is an incredible addition to PostgreSQL, allowing concurrent storage of both chat history and the theoretical “brain” for your agent. This allows the user to further fine-tune machine learning models that may be lacking in specific areas.
pgvector is an extension for PostgreSQL that allows users to perform vector search within the same database as their traditional data. If you can store it in PostgreSQL, you can perform vector search on it: place whatever documents you like into the documents folder, and they will be loaded into the vector extension of your database, allowing you to store documents persistently and grow your vector search database over time.
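For readers new to vector search, the sketch below shows the core idea in plain Python: each document is stored with an embedding vector, and a query returns the documents whose vectors are most similar to the query vector. This is only a conceptual illustration. The three-dimensional vectors are made up by hand (real embeddings come from an embedding model and have many more dimensions), the function names are hypothetical, and the extension performs the comparison inside PostgreSQL rather than in Python.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k document titles whose vectors are closest to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), title) for title, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:k]]


# Hand-written toy embeddings, purely for illustration.
store = [
    ("PostgreSQL installation notes", [0.9, 0.1, 0.0]),
    ("Streamlit layout tips", [0.1, 0.9, 0.0]),
    ("pgvector index tuning", [0.8, 0.2, 0.1]),
]

results = search([1.0, 0.0, 0.0], store, k=2)
print(results)  # ['PostgreSQL installation notes', 'pgvector index tuning']
```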
My preferred use case for this feature is to load the documentation for your ideal tech stack into the documents folder to create your own agent toolkit. This will vectorize the documents and allow you to query them as you please.
This application requires several different API keys to use all the available features. You do not need the API keys for the application to work; adding them essentially unlocks different features throughout the application. If you do not have the necessary API keys, you will see an error thrown where they are used. In the future, some of these features will be released as independent applications because they may be more helpful to people as specialized tools than as part of an experiment.
These are the API keys required for each section. They are cumulative, meaning you can't skip the first section, and many are required throughout:
I have started a plugin repository named miniAGI-plugins. This repo is currently under construction, but I have several plugins already functioning. I just need to standardize them and develop a structure for everyone to use.
These plugins include game simulations in a gymnasium environment and a debate simulation with an agent moderator, in which agents debate an ethical issue using a search and memory toolkit.
Other examples include the ability to chat with your Discord data, which can be retrieved via Discord. The descriptions below are examples of the plugins currently available.
Here is some more information on Claude. Claude allows longer context input for large datasets and very high-level natural language processing. miniAGI combines that with the ability to use other tools, store your results locally, and chain together more complex queries at once rather than one query at a time. miniAGI has the potential to triple your productivity with Claude depending on your specific use case, though your mileage may vary.
Banana: This service allows the user to host their own machine learning models in the cloud and use them as an API in applications. It is integrated into this application in the same way as the other services, through the plan-and-execute agent. This service is costly and is a production machine learning environment. It is not required at all, but it is very useful if you are building your own chat model or another model whose metrics you want to use in your other research. The default model that is configured is TheBloke/WizardLM-1.0-Uncensored-Llama2-13B-GPTQ, but you will have to configure and deploy whatever model you wish. The WizardVicuna ideology was originally created by Eric Hartford, but this project uses the quantized version from TheBloke.
Potassium: If you are still working on your project, you can run it on a Potassium server and connect a local integration rather than a cloud integration. This allows the user to save money on deployment costs while still maintaining a similar testing experience.
- This agent uses the QARetrievalChain and the Deep Lake Hub to create a vector search database out of your existing codebase. The user can define a location to create the vector store, and the agent will discover all of the available Python files in that folder structure and split them in half. The split may not be even, because it keeps the structure of the functions rather than enforcing a hard character limit.
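The idea of splitting a file in half without cutting through a function can be sketched with Python's standard `ast` module. This is an illustration under assumptions, not the splitter miniAGI actually uses: the helper names `statement_start` and `split_in_half` are hypothetical, and the rule used here (cut at the top-level statement boundary nearest the middle line) is just one reasonable way to get the behaviour described above.

```python
import ast


def statement_start(node: ast.stmt) -> int:
    """First line of a top-level statement, including any decorators."""
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators])


def split_in_half(source: str) -> list[str]:
    """Split Python source into two chunks at a top-level statement boundary.

    The cut is placed at the boundary closest to the middle line, so no
    function or class is cut in two and the halves may be uneven.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    starts = [statement_start(node) for node in tree.body]
    if len(starts) < 2:
        return [source]  # Nothing sensible to split.
    middle = len(lines) / 2
    # Skip the first statement: cutting there would leave an empty chunk.
    cut = min(starts[1:], key=lambda start: abs((start - 1) - middle))
    return ["\n".join(lines[: cut - 1]), "\n".join(lines[cut - 1:])]


sample = '''import os

def load(path):
    with open(path) as fh:
        return fh.read()

def count_words(text):
    words = text.split()
    return len(words)

def main():
    print(count_words(load(os.devnull)))
'''

chunks = split_in_half(sample)
for number, chunk in enumerate(chunks, start=1):
    print(f"--- chunk {number} ---")
    print(chunk)
```

Each chunk remains valid Python on its own, which is what makes it usable as a unit for embedding and retrieval.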
We look forward to your contributions!