文章目录
PandasAI
PandasAI是一个使数据分析变得富有对话性和有趣的库。它利用pandas数据框和最先进的LLMs的强大功能,让用户以对话方式进行数据分析。
与
pandas
所做的类似(10分钟入门pandas -> https://pandas.pydata.org/docs/user_guide/10min.html),我们希望创建最简单的方式来学习如何掌握PandasAI。
让我们开始吧!
设置
要开始使用,我们需要安装最新版本的PandasAI。
# 安装pandasai库
!pip install pandasai
Collecting pandasai
Downloading pandasai-1.2.7-py3-none-any.whl (73 kB)
[?25l [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/73.2 kB[0m [31m?[0m eta [36m-:--:--[0m
[2K [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[90m╺[0m [32m71.7/73.2 kB[0m [31m2.2 MB/s[0m eta [36m0:00:01[0m
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m73.2/73.2 kB[0m [31m1.8 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting astor<0.9.0,>=0.8.1 (from pandasai)
Downloading astor-0.8.1-py2.py3-none-any.whl (27 kB)
Requirement already satisfied: duckdb<0.9.0,>=0.8.1 in /usr/local/lib/python3.10/dist-packages (from pandasai) (0.8.1)
Collecting ipython<9.0.0,>=8.13.1 (from pandasai)
Downloading ipython-8.15.0-py3-none-any.whl (806 kB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m806.6/806.6 kB[0m [31m15.7 MB/s[0m eta [36m0:00:00[0m
[?25hRequirement already satisfied: matplotlib<4.0.0,>=3.7.1 in /usr/local/lib/python3.10/dist-packages (from pandasai) (3.7.1)
Collecting openai<0.28.0,>=0.27.5 (from pandasai)
Downloading openai-0.27.10-py3-none-any.whl (76 kB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m76.5/76.5 kB[0m [31m8.6 MB/s[0m eta [36m0:00:00[0m
[?25hRequirement already satisfied: pandas==1.5.3 in /usr/local/lib/python3.10/dist-packages (from pandasai) (1.5.3)
Requirement already satisfied: pydantic<2,>=1 in /usr/local/lib/python3.10/dist-packages (from pandasai) (1.10.12)
Collecting python-dotenv<2.0.0,>=1.0.0 (from pandasai)
Downloading python_dotenv-1.0.0-py3-none-any.whl (19 kB)
Requirement already satisfied: scipy<2.0.0,>=1.9.0 in /usr/local/lib/python3.10/dist-packages (from pandasai) (1.11.2)
Collecting sqlalchemy<2.0.0,>=1.4.49 (from pandasai)
Downloading SQLAlchemy-1.4.49-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.6 MB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.6/1.6 MB[0m [31m43.8 MB/s[0m eta [36m0:00:00[0m
[?25hRequirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai) (2023.3.post1)
Requirement already satisfied: numpy>=1.21.0 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai) (1.23.5)
Requirement already satisfied: backcall in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai) (0.2.0)
Requirement already satisfied: decorator in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai) (4.4.2)
Collecting jedi>=0.16 (from ipython<9.0.0,>=8.13.1->pandasai)
Downloading jedi-0.19.0-py2.py3-none-any.whl (1.6 MB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.6/1.6 MB[0m [31m63.3 MB/s[0m eta [36m0:00:00[0m
[?25hRequirement already satisfied: matplotlib-inline in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai) (0.1.6)
Requirement already satisfied: pickleshare in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai) (0.7.5)
Requirement already satisfied: prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30 in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai) (3.0.39)
Requirement already satisfied: pygments>=2.4.0 in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai) (2.16.1)
Collecting stack-data (from ipython<9.0.0,>=8.13.1->pandasai)
Downloading stack_data-0.6.2-py3-none-any.whl (24 kB)
Requirement already satisfied: traitlets>=5 in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai) (5.7.1)
Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai) (1.1.3)
Requirement already satisfied: pexpect>4.3 in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai) (4.8.0)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai) (1.1.0)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai) (4.42.1)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai) (1.4.5)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai) (23.1)
Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai) (9.4.0)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai) (3.1.1)
Requirement already satisfied: requests>=2.20 in /usr/local/lib/python3.10/dist-packages (from openai<0.28.0,>=0.27.5->pandasai) (2.31.0)
Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from openai<0.28.0,>=0.27.5->pandasai) (4.66.1)
Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from openai<0.28.0,>=0.27.5->pandasai) (3.8.5)
Requirement already satisfied: typing-extensions>=4.2.0 in /usr/local/lib/python3.10/dist-packages (from pydantic<2,>=1->pandasai) (4.5.0)
Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from sqlalchemy<2.0.0,>=1.4.49->pandasai) (2.0.2)
Requirement already satisfied: parso<0.9.0,>=0.8.3 in /usr/local/lib/python3.10/dist-packages (from jedi>=0.16->ipython<9.0.0,>=8.13.1->pandasai) (0.8.3)
Requirement already satisfied: ptyprocess>=0.5 in /usr/local/lib/python3.10/dist-packages (from pexpect>4.3->ipython<9.0.0,>=8.13.1->pandasai) (0.7.0)
Requirement already satisfied: wcwidth in /usr/local/lib/python3.10/dist-packages (from prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30->ipython<9.0.0,>=8.13.1->pandasai) (0.2.6)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.1->pandas==1.5.3->pandasai) (1.16.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai<0.28.0,>=0.27.5->pandasai) (3.2.0)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai<0.28.0,>=0.27.5->pandasai) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai<0.28.0,>=0.27.5->pandasai) (2.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai<0.28.0,>=0.27.5->pandasai) (2023.7.22)
Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.5->pandasai) (23.1.0)
Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.5->pandasai) (6.0.4)
Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.5->pandasai) (4.0.3)
Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.5->pandasai) (1.9.2)
Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.5->pandasai) (1.4.0)
Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.5->pandasai) (1.3.1)
Collecting executing>=1.2.0 (from stack-data->ipython<9.0.0,>=8.13.1->pandasai)
Downloading executing-1.2.0-py2.py3-none-any.whl (24 kB)
Collecting asttokens>=2.1.0 (from stack-data->ipython<9.0.0,>=8.13.1->pandasai)
Downloading asttokens-2.4.0-py2.py3-none-any.whl (27 kB)
Collecting pure-eval (from stack-data->ipython<9.0.0,>=8.13.1->pandasai)
Downloading pure_eval-0.2.2-py3-none-any.whl (11 kB)
Installing collected packages: pure-eval, executing, sqlalchemy, python-dotenv, jedi, asttokens, astor, stack-data, openai, ipython, pandasai
Attempting uninstall: sqlalchemy
Found existing installation: SQLAlchemy 2.0.20
Uninstalling SQLAlchemy-2.0.20:
Successfully uninstalled SQLAlchemy-2.0.20
Attempting uninstall: ipython
Found existing installation: ipython 7.34.0
Uninstalling ipython-7.34.0:
Successfully uninstalled ipython-7.34.0
[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires ipython==7.34.0, but you have ipython 8.15.0 which is incompatible.
ipython-sql 0.5.0 requires sqlalchemy>=2.0, but you have sqlalchemy 1.4.49 which is incompatible.[0m[31m
[0mSuccessfully installed astor-0.8.1 asttokens-2.4.0 executing-1.2.0 ipython-8.15.0 jedi-0.19.0 openai-0.27.10 pandasai-1.2.7 pure-eval-0.2.2 python-dotenv-1.0.0 sqlalchemy-1.4.49 stack-data-0.6.2
SmartDataframe
SmartDataframe是一个继承了
pd.DataFrame
的pandas(或polars)数据框,它除了具有
pd.DataFrame
的所有属性和方法外,还添加了对话功能。
# 导入pandasai库中的SmartDataframe类from pandasai import SmartDataframe
您可以通过从多个不同的来源实例化一个数据框架(pandas或polars数据框架、csv、xlsx或Google Sheets)。
从pandas数据框导入
要从pandas dataframe导入数据,您需要先导入pandas库并创建一个dataframe。
# 导入pandas库import pandas as pd
# 创建一个DataFrame对象,包含国家、GDP和幸福指数的数据
df = pd.DataFrame({"country":["United States",# 美国"United Kingdom",# 英国"France",# 法国"Germany",# 德国"Italy",# 意大利"Spain",# 西班牙"Canada",# 加拿大"Australia",# 澳大利亚"Japan",# 日本"China",# 中国],"gdp":[19294482071552,# GDP数据2891615567872,2411255037952,3435817336832,1745433788416,1181205135360,1607402389504,1490967855104,4380756541440,14631844184064,],"happiness_index":[6.94,7.16,6.66,7.07,6.38,6.4,7.23,7.22,5.87,5.12],# 幸福指数数据})
由于PandasAI由LLM提供支持,您应该导入您想要用于您的用例的LLM。在这种情况下,我们将使用OpenAI。
要使用OpenAI,您需要一个API令牌。按照以下简单步骤生成您的API_TOKEN:
openai。
- 访问https://openai.com/api/并使用您的电子邮件地址注册或连接您的Google帐户。
- 在个人帐户设置的左侧,点击"View API Keys"。
- 选择"Create new Secret key"。
访问openai的API是一个付费服务。在进行实验之前,请阅读Pricing信息。
# 导入OpenAI类from pandasai.llm import OpenAI
# 创建一个OpenAI对象,并传入api_token参数
llm = OpenAI(api_token="YOUR TOKEN")
现在我们已经实例化了LLM,我们终于可以实例化
SmartDataframe
了。
# 创建一个SmartDataframe对象,并传入一个DataFrame对象df和一个配置参数config={"llm": llm}
sdf = SmartDataframe(df, config={"llm": llm})
一个
SmartDataframe
继承了原始数据框的所有方法和属性。例如:
# 使用条件筛选,返回country列为'United States'的行
result = sdf[sdf['country']=='United States']# 打印结果print(result)
但是您也可以用自然语言进行查询。
# 调用chat函数,参数为"Return the top 5 countries by GDP"
sdf.chat("Return the top 5 countries by GDP")
# 调用chat函数,并传入一个问题作为参数
sdf.chat("What's the sum of the gdp of the 2 unhappiest countries?")
# 打印出sdf对象的last_code_generated属性的值print(sdf.last_code_generated)
def analyze_data(dfs: list[pd.DataFrame]) ->dict:
df_combined = pd.concat(dfs)
df_sorted = df_combined.sort_values('happiness_index')
sum_gdp = df_sorted.head(2)['gdp'].sum()
return {'type': 'number', 'value': sum_gdp}
result = analyze_data(dfs)
绘制图表
您还可以使用PandasAI轻松绘制图表
# 调用chat函数,传入参数"Plot a chart of the gdp by country",并输出结果
sdf.chat("Plot a chart of the gdp by country")
您还可以提供额外的指示。例如,假设您想为每个柱状图使用不同的颜色。您只需要向PandasAI提出要求即可。
# 使用seaborn库中的chat函数绘制直方图# 参数为gdp和country,表示按照国家绘制gdp的直方图# 每个直方图的颜色不同
sdf.chat("Plot a histogram of the gdp by country, using a different color for each bar")
作为一种替代方法,您可以使用
shortcuts
。快捷方式是一种函数,可以避免您编写提示并在幕后为您执行"魔法"。
例如,您可以使用
.plot_bar_chart()
生成相同的图表,提供字段:
# 绘制柱状图
sdf.plot_bar_chart(x="country", y="gdp")
因此,例如,如果我们想要将其可视化为饼图,您可以调用
plot_pie_chart
快捷方式,传递我们想要用作标签的字段和我们想要用作值的字段。
# 绘制饼图
sdf.plot_pie_chart(labels="country", values="gdp")
智能数据湖
有时候,您可能希望同时处理多个数据框,让LLM来协调使用哪一个来回答您的查询。在这种情况下,您应该使用
SmartDatalake
而不是
SmartDataframe
。
这个概念与
SmartDataframe
非常相似,但是它可以接受多个数据框作为输入,而不仅仅是一个。
# 导入SmartDatalake模块from pandasai import SmartDatalake
例如,在这个例子中,我们提供了两个不同的数据框。
在第一个数据框中,每个员工报告了员工编号、姓名和部门。
而在第二个数据框中,提供了员工编号和每个员工的薪水。
询问PandasAI,它将通过员工编号将这两个不同的数据框连接起来,并找出薪水最高的员工的姓名。
# 创建员工信息的数据框
employees_df = pd.DataFrame({"EmployeeID":[1,2,3,4,5],"Name":["John","Emma","Liam","Olivia","William"],"Department":["HR","Sales","IT","Marketing","Finance"],})# 创建薪资信息的数据框
salaries_df = pd.DataFrame({"EmployeeID":[1,2,3,4,5],"Salary":[5000,6000,4500,7000,5500],})# 创建SmartDatalake对象,并将员工信息和薪资信息作为参数传入
lake = SmartDatalake([employees_df, salaries_df],
config={"llm": llm})# 调用chat方法,传入问题"Who gets paid the most?",返回结果
lake.chat("Who gets paid the most?")
这是一个生成的代码示例:
# 打印变量lake中存储的最后一次执行的代码print(lake.last_code_executed)
def analyze_data(dfs: list[pd.DataFrame]) ->dict:
"""
Analyze the data
1. Prepare: Preprocessing and cleaning data if necessary
2. Process: Manipulating data for analysis (grouping, filtering, aggregating, etc.)
3. Analyze: Conducting the actual analysis (if the user asks to plot a chart save it to an image in exports/charts/temp_chart.png and do not show the chart.)
4. Output: return a dictionary of:
- type (possible values "text", "number", "dataframe", "plot")
- value (can be a string, a dataframe or the path of the plot, NOT a dictionary)
Example output: { "type": "text", "value": "The average loan amount is $15,000." }
"""
merged_df = pd.merge(dfs[0], dfs[1], on='EmployeeID')
max_salary_employee = merged_df.loc[merged_df['Salary'].idxmax()]
employee_name = max_salary_employee['Name']
return {'type': 'text', 'value': f'The employee who gets paid the most is {employee_name}.'}
好的,在这种情况下很容易:两个表都共享一个名为
EmployeeID
的公共值,对吗?
让我们试试更复杂的情况
# 创建一个包含用户信息的DataFrame
users_df = pd.DataFrame({"id":[1,2,3,4,5],"name":["John","Emma","Liam","Olivia","William"]})# 创建一个名为"users"的SmartDataframe对象,用于处理用户信息
users = SmartDataframe(users_df, name="users")# 创建一个包含照片信息的DataFrame
photos_df = pd.DataFrame({"id":[31,32,33,34,35],"user_id":[1,1,2,4,5]})# 创建一个名为"photos"的SmartDataframe对象,用于处理照片信息
photos = SmartDataframe(photos_df, name="photos")# 创建一个SmartDatalake对象,将"users"和"photos"作为参数传入,并设置配置项
lake = SmartDatalake([users, photos], config={"llm": llm})# 调用SmartDatalake对象的chat方法,向其提问"John上传了多少张照片?"
lake.chat("How many photos has been uploaded by John?")
在这种情况下,我们为每个数据框提供了一个表名,这样LLM就有了一些上下文,并且可以更好地执行连接操作。正如您在下面的示例中所看到的,它成功地找出了正确的连接方式。实际上,用户"John"实际上有2张照片。
# 打印lake变量中存储的最后一次执行的代码print(lake.last_code_executed)
def analyze_data(dfs: list[pd.DataFrame]) ->dict:
users = dfs[0]
photos = dfs[1]
merged_df = pd.merge(users, photos, left_on='id', right_on='user_id')
john_photos = merged_df[merged_df['name'] == 'John']
num_photos = john_photos.shape[0]
return {'type': 'number', 'value': num_photos}
result = analyze_data(dfs)
不同的LLM
尽管目前OpenAI GPT3.5和GPT4是推荐的模型,我们也支持其他模型,如Starcoder和Falcon。
您可以按照以下方式使用它们:
# 导入所需的库from pandasai import SmartDataframe
from pandasai.llm import Starcoder, Falcon
# 创建一个Starcoder对象,并传入API令牌
starcoder_llm = Starcoder(api_token="YOUR TOKEN")# 创建一个Falcon对象,并传入API令牌
falcon_llm = Falcon(api_token="YOUR TOKEN")# 使用Starcoder对象创建一个SmartDataframe对象,并传入数据框和配置参数
df1 = SmartDataframe(df, config={"llm": starcoder_llm})# 使用Falcon对象创建一个SmartDataframe对象,并传入数据框和配置参数
df2 = SmartDataframe(df, config={"llm": falcon_llm})# 打印使用df1对象进行的聊天操作的结果print(df1.chat("Which country has the highest GDP?"))# 打印使用df2对象进行的聊天操作的结果print(df2.chat("Which one is the unhappiest country?"))
LangChain LLMs
在某些情况下,您可能希望使用LangChain LLMs。
# 安装pandasai[langchain]模块
!pip install pandasai[langchain]
Requirement already satisfied: pandasai[langchain] in /usr/local/lib/python3.10/dist-packages (1.1.1)
Requirement already satisfied: astor<0.9.0,>=0.8.1 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (0.8.1)
Requirement already satisfied: ipython<9.0.0,>=8.13.1 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (8.15.0)
Requirement already satisfied: matplotlib<4.0.0,>=3.7.1 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (3.7.1)
Requirement already satisfied: openai<0.28.0,>=0.27.5 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (0.27.10)
Requirement already satisfied: pandas==1.5.3 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (1.5.3)
Requirement already satisfied: pydantic<2,>=1 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (1.10.12)
Requirement already satisfied: python-dotenv<2.0.0,>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (1.0.0)
Requirement already satisfied: scipy<2.0.0,>=1.9.0 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (1.10.1)
Collecting langchain<0.0.200,>=0.0.199 (from pandasai[langchain])
Downloading langchain-0.0.199-py3-none-any.whl (1.0 MB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.0/1.0 MB[0m [31m6.7 MB/s[0m eta [36m0:00:00[0m
[?25hRequirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai[langchain]) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai[langchain]) (2023.3)
Requirement already satisfied: numpy>=1.21.0 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai[langchain]) (1.23.5)
Requirement already satisfied: backcall in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[langchain]) (0.2.0)
Requirement already satisfied: decorator in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[langchain]) (4.4.2)
Requirement already satisfied: jedi>=0.16 in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[langchain]) (0.19.0)
Requirement already satisfied: matplotlib-inline in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[langchain]) (0.1.6)
Requirement already satisfied: pickleshare in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[langchain]) (0.7.5)
Requirement already satisfied: prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30 in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[langchain]) (3.0.39)
Requirement already satisfied: pygments>=2.4.0 in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[langchain]) (2.16.1)
Requirement already satisfied: stack-data in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[langchain]) (0.6.2)
Requirement already satisfied: traitlets>=5 in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[langchain]) (5.7.1)
Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[langchain]) (1.1.3)
Requirement already satisfied: pexpect>4.3 in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[langchain]) (4.8.0)
Requirement already satisfied: PyYAML>=5.4.1 in /usr/local/lib/python3.10/dist-packages (from langchain<0.0.200,>=0.0.199->pandasai[langchain]) (6.0.1)
Requirement already satisfied: SQLAlchemy<3,>=1.4 in /usr/local/lib/python3.10/dist-packages (from langchain<0.0.200,>=0.0.199->pandasai[langchain]) (2.0.20)
Requirement already satisfied: aiohttp<4.0.0,>=3.8.3 in /usr/local/lib/python3.10/dist-packages (from langchain<0.0.200,>=0.0.199->pandasai[langchain]) (3.8.5)
Requirement already satisfied: async-timeout<5.0.0,>=4.0.0 in /usr/local/lib/python3.10/dist-packages (from langchain<0.0.200,>=0.0.199->pandasai[langchain]) (4.0.3)
Collecting dataclasses-json<0.6.0,>=0.5.7 (from langchain<0.0.200,>=0.0.199->pandasai[langchain])
Downloading dataclasses_json-0.5.14-py3-none-any.whl (26 kB)
Collecting langchainplus-sdk>=0.0.9 (from langchain<0.0.200,>=0.0.199->pandasai[langchain])
Downloading langchainplus_sdk-0.0.20-py3-none-any.whl (25 kB)
Requirement already satisfied: numexpr<3.0.0,>=2.8.4 in /usr/local/lib/python3.10/dist-packages (from langchain<0.0.200,>=0.0.199->pandasai[langchain]) (2.8.5)
Collecting openapi-schema-pydantic<2.0,>=1.2 (from langchain<0.0.200,>=0.0.199->pandasai[langchain])
Downloading openapi_schema_pydantic-1.2.4-py3-none-any.whl (90 kB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m90.0/90.0 kB[0m [31m11.3 MB/s[0m eta [36m0:00:00[0m
[?25hRequirement already satisfied: requests<3,>=2 in /usr/local/lib/python3.10/dist-packages (from langchain<0.0.200,>=0.0.199->pandasai[langchain]) (2.31.0)
Requirement already satisfied: tenacity<9.0.0,>=8.1.0 in /usr/local/lib/python3.10/dist-packages (from langchain<0.0.200,>=0.0.199->pandasai[langchain]) (8.2.3)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai[langchain]) (1.1.0)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai[langchain]) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai[langchain]) (4.42.1)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai[langchain]) (1.4.4)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai[langchain]) (23.1)
Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai[langchain]) (9.4.0)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai[langchain]) (3.1.1)
Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from openai<0.28.0,>=0.27.5->pandasai[langchain]) (4.66.1)
Requirement already satisfied: typing-extensions>=4.2.0 in /usr/local/lib/python3.10/dist-packages (from pydantic<2,>=1->pandasai[langchain]) (4.7.1)
Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain<0.0.200,>=0.0.199->pandasai[langchain]) (23.1.0)
Requirement already satisfied: charset-normalizer<4.0,>=2.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain<0.0.200,>=0.0.199->pandasai[langchain]) (3.2.0)
Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain<0.0.200,>=0.0.199->pandasai[langchain]) (6.0.4)
Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain<0.0.200,>=0.0.199->pandasai[langchain]) (1.9.2)
Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain<0.0.200,>=0.0.199->pandasai[langchain]) (1.4.0)
Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain<0.0.200,>=0.0.199->pandasai[langchain]) (1.3.1)
Collecting marshmallow<4.0.0,>=3.18.0 (from dataclasses-json<0.6.0,>=0.5.7->langchain<0.0.200,>=0.0.199->pandasai[langchain])
Downloading marshmallow-3.20.1-py3-none-any.whl (49 kB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m49.4/49.4 kB[0m [31m6.8 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting typing-inspect<1,>=0.4.0 (from dataclasses-json<0.6.0,>=0.5.7->langchain<0.0.200,>=0.0.199->pandasai[langchain])
Downloading typing_inspect-0.9.0-py3-none-any.whl (8.8 kB)
Requirement already satisfied: parso<0.9.0,>=0.8.3 in /usr/local/lib/python3.10/dist-packages (from jedi>=0.16->ipython<9.0.0,>=8.13.1->pandasai[langchain]) (0.8.3)
Requirement already satisfied: ptyprocess>=0.5 in /usr/local/lib/python3.10/dist-packages (from pexpect>4.3->ipython<9.0.0,>=8.13.1->pandasai[langchain]) (0.7.0)
Requirement already satisfied: wcwidth in /usr/local/lib/python3.10/dist-packages (from prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30->ipython<9.0.0,>=8.13.1->pandasai[langchain]) (0.2.6)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.1->pandas==1.5.3->pandasai[langchain]) (1.16.0)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain<0.0.200,>=0.0.199->pandasai[langchain]) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain<0.0.200,>=0.0.199->pandasai[langchain]) (2.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain<0.0.200,>=0.0.199->pandasai[langchain]) (2023.7.22)
Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from SQLAlchemy<3,>=1.4->langchain<0.0.200,>=0.0.199->pandasai[langchain]) (2.0.2)
Requirement already satisfied: executing>=1.2.0 in /usr/local/lib/python3.10/dist-packages (from stack-data->ipython<9.0.0,>=8.13.1->pandasai[langchain]) (1.2.0)
Requirement already satisfied: asttokens>=2.1.0 in /usr/local/lib/python3.10/dist-packages (from stack-data->ipython<9.0.0,>=8.13.1->pandasai[langchain]) (2.3.0)
Requirement already satisfied: pure-eval in /usr/local/lib/python3.10/dist-packages (from stack-data->ipython<9.0.0,>=8.13.1->pandasai[langchain]) (0.2.2)
Collecting mypy-extensions>=0.3.0 (from typing-inspect<1,>=0.4.0->dataclasses-json<0.6.0,>=0.5.7->langchain<0.0.200,>=0.0.199->pandasai[langchain])
Downloading mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)
Installing collected packages: mypy-extensions, marshmallow, typing-inspect, openapi-schema-pydantic, langchainplus-sdk, dataclasses-json, langchain
Successfully installed dataclasses-json-0.5.14 langchain-0.0.199 langchainplus-sdk-0.0.20 marshmallow-3.20.1 mypy-extensions-1.0.0 openapi-schema-pydantic-1.2.4 typing-inspect-0.9.0
然后您可以将它们用作PandasAI LLMs。
# 导入所需的库from pandasai import SmartDataframe
from langchain.llms import OpenAI
# from langchain.llms import Anthropic# from langchain.llms import LlamaCpp# 创建一个OpenAI实例,传入你的API密钥和最大token数
langchain_llm = OpenAI(openai_api_key="YOUR TOKEN", max_tokens=1000)# 创建一个SmartDataframe实例,传入数据框和配置参数
langchain_sdf = SmartDataframe(df, config={"llm": langchain_llm})# 调用chat方法,向模型提问
langchain_sdf.chat("Which are the top 5 countries by GPD?")
连接器
PandasAI提供了许多连接器,允许您连接到不同的数据源。这些连接器被设计成易于使用,即使您对数据源或PandasAI不熟悉。
要使用连接器,您首先需要安装所需的依赖项。您可以通过运行以下命令来完成此操作:
# 安装pandasai[connectors]包
!pip install pandasai[connectors]
Requirement already satisfied: pandasai[connectors] in /usr/local/lib/python3.10/dist-packages (1.2.7)
Requirement already satisfied: astor<0.9.0,>=0.8.1 in /usr/local/lib/python3.10/dist-packages (from pandasai[connectors]) (0.8.1)
Requirement already satisfied: duckdb<0.9.0,>=0.8.1 in /usr/local/lib/python3.10/dist-packages (from pandasai[connectors]) (0.8.1)
Requirement already satisfied: ipython<9.0.0,>=8.13.1 in /usr/local/lib/python3.10/dist-packages (from pandasai[connectors]) (8.15.0)
Requirement already satisfied: matplotlib<4.0.0,>=3.7.1 in /usr/local/lib/python3.10/dist-packages (from pandasai[connectors]) (3.7.1)
Requirement already satisfied: openai<0.28.0,>=0.27.5 in /usr/local/lib/python3.10/dist-packages (from pandasai[connectors]) (0.27.10)
Requirement already satisfied: pandas==1.5.3 in /usr/local/lib/python3.10/dist-packages (from pandasai[connectors]) (1.5.3)
Requirement already satisfied: pydantic<2,>=1 in /usr/local/lib/python3.10/dist-packages (from pandasai[connectors]) (1.10.12)
Requirement already satisfied: python-dotenv<2.0.0,>=1.0.0 in /usr/local/lib/python3.10/dist-packages (from pandasai[connectors]) (1.0.0)
Requirement already satisfied: scipy<2.0.0,>=1.9.0 in /usr/local/lib/python3.10/dist-packages (from pandasai[connectors]) (1.11.2)
Requirement already satisfied: sqlalchemy<2.0.0,>=1.4.49 in /usr/local/lib/python3.10/dist-packages (from pandasai[connectors]) (1.4.49)
Requirement already satisfied: psycopg2<3.0.0,>=2.9.7 in /usr/local/lib/python3.10/dist-packages (from pandasai[connectors]) (2.9.7)
Collecting pymysql<2.0.0,>=1.1.0 (from pandasai[connectors])
Downloading PyMySQL-1.1.0-py3-none-any.whl (44 kB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m44.8/44.8 kB[0m [31m709.3 kB/s[0m eta [36m0:00:00[0m
[?25hCollecting snowflake-sqlalchemy<2.0.0,>=1.5.0 (from pandasai[connectors])
Downloading snowflake_sqlalchemy-1.5.0-py2.py3-none-any.whl (33 kB)
Collecting sqlalchemy-databricks<0.3.0,>=0.2.0 (from pandasai[connectors])
Downloading sqlalchemy_databricks-0.2.0-py3-none-any.whl (4.3 kB)
Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai[connectors]) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai[connectors]) (2023.3.post1)
Requirement already satisfied: numpy>=1.21.0 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai[connectors]) (1.23.5)
Requirement already satisfied: backcall in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[connectors]) (0.2.0)
Requirement already satisfied: decorator in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[connectors]) (4.4.2)
Requirement already satisfied: jedi>=0.16 in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[connectors]) (0.19.0)
Requirement already satisfied: matplotlib-inline in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[connectors]) (0.1.6)
Requirement already satisfied: pickleshare in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[connectors]) (0.7.5)
Requirement already satisfied: prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30 in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[connectors]) (3.0.39)
Requirement already satisfied: pygments>=2.4.0 in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[connectors]) (2.16.1)
Requirement already satisfied: stack-data in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[connectors]) (0.6.2)
Requirement already satisfied: traitlets>=5 in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[connectors]) (5.7.1)
Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[connectors]) (1.1.3)
Requirement already satisfied: pexpect>4.3 in /usr/local/lib/python3.10/dist-packages (from ipython<9.0.0,>=8.13.1->pandasai[connectors]) (4.8.0)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai[connectors]) (1.1.0)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai[connectors]) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai[connectors]) (4.42.1)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai[connectors]) (1.4.5)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai[connectors]) (23.1)
Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai[connectors]) (9.4.0)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib<4.0.0,>=3.7.1->pandasai[connectors]) (3.1.1)
Requirement already satisfied: requests>=2.20 in /usr/local/lib/python3.10/dist-packages (from openai<0.28.0,>=0.27.5->pandasai[connectors]) (2.31.0)
Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from openai<0.28.0,>=0.27.5->pandasai[connectors]) (4.66.1)
Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from openai<0.28.0,>=0.27.5->pandasai[connectors]) (3.8.5)
Requirement already satisfied: typing-extensions>=4.2.0 in /usr/local/lib/python3.10/dist-packages (from pydantic<2,>=1->pandasai[connectors]) (4.5.0)
Collecting snowflake-connector-python<4.0.0 (from snowflake-sqlalchemy<2.0.0,>=1.5.0->pandasai[connectors])
Downloading snowflake_connector_python-3.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (24.6 MB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m24.6/24.6 MB[0m [31m34.3 MB/s[0m eta [36m0:00:00[0m
[?25hRequirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from sqlalchemy<2.0.0,>=1.4.49->pandasai[connectors]) (2.0.2)
Collecting PyHive<1,>=0 (from sqlalchemy-databricks<0.3.0,>=0.2.0->pandasai[connectors])
Downloading PyHive-0.7.0.tar.gz (46 kB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m46.5/46.5 kB[0m [31m4.8 MB/s[0m eta [36m0:00:00[0m
[?25h Preparing metadata (setup.py) ... [?25l[?25hdone
Collecting databricks-sql-connector<3,>=2 (from sqlalchemy-databricks<0.3.0,>=0.2.0->pandasai[connectors])
Downloading databricks_sql_connector-2.9.3-py3-none-any.whl (297 kB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m297.3/297.3 kB[0m [31m25.2 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting alembic<2.0.0,>=1.0.11 (from databricks-sql-connector<3,>=2->sqlalchemy-databricks<0.3.0,>=0.2.0->pandasai[connectors])
Downloading alembic-1.12.0-py3-none-any.whl (226 kB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m226.0/226.0 kB[0m [31m22.3 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting lz4<5.0.0,>=4.0.2 (from databricks-sql-connector<3,>=2->sqlalchemy-databricks<0.3.0,>=0.2.0->pandasai[connectors])
Downloading lz4-4.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.3/1.3 MB[0m [31m56.0 MB/s[0m eta [36m0:00:00[0m
[?25hRequirement already satisfied: oauthlib<4.0.0,>=3.1.0 in /usr/local/lib/python3.10/dist-packages (from databricks-sql-connector<3,>=2->sqlalchemy-databricks<0.3.0,>=0.2.0->pandasai[connectors]) (3.2.2)
Requirement already satisfied: openpyxl<4.0.0,>=3.0.10 in /usr/local/lib/python3.10/dist-packages (from databricks-sql-connector<3,>=2->sqlalchemy-databricks<0.3.0,>=0.2.0->pandasai[connectors]) (3.1.2)
Requirement already satisfied: pyarrow>=6.0.0 in /usr/local/lib/python3.10/dist-packages (from databricks-sql-connector<3,>=2->sqlalchemy-databricks<0.3.0,>=0.2.0->pandasai[connectors]) (9.0.0)
Collecting thrift<0.17.0,>=0.16.0 (from databricks-sql-connector<3,>=2->sqlalchemy-databricks<0.3.0,>=0.2.0->pandasai[connectors])
Downloading thrift-0.16.0.tar.gz (59 kB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m59.6/59.6 kB[0m [31m6.7 MB/s[0m eta [36m0:00:00[0m
[?25h Preparing metadata (setup.py) ... [?25l[?25hdone
Requirement already satisfied: urllib3>=1.0 in /usr/local/lib/python3.10/dist-packages (from databricks-sql-connector<3,>=2->sqlalchemy-databricks<0.3.0,>=0.2.0->pandasai[connectors]) (2.0.4)
Requirement already satisfied: parso<0.9.0,>=0.8.3 in /usr/local/lib/python3.10/dist-packages (from jedi>=0.16->ipython<9.0.0,>=8.13.1->pandasai[connectors]) (0.8.3)
Requirement already satisfied: ptyprocess>=0.5 in /usr/local/lib/python3.10/dist-packages (from pexpect>4.3->ipython<9.0.0,>=8.13.1->pandasai[connectors]) (0.7.0)
Requirement already satisfied: wcwidth in /usr/local/lib/python3.10/dist-packages (from prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30->ipython<9.0.0,>=8.13.1->pandasai[connectors]) (0.2.6)
Requirement already satisfied: future in /usr/local/lib/python3.10/dist-packages (from PyHive<1,>=0->sqlalchemy-databricks<0.3.0,>=0.2.0->pandasai[connectors]) (0.18.3)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.1->pandas==1.5.3->pandasai[connectors]) (1.16.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai<0.28.0,>=0.27.5->pandasai[connectors]) (3.2.0)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai<0.28.0,>=0.27.5->pandasai[connectors]) (3.4)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai<0.28.0,>=0.27.5->pandasai[connectors]) (2023.7.22)
Collecting asn1crypto<2.0.0,>0.24.0 (from snowflake-connector-python<4.0.0->snowflake-sqlalchemy<2.0.0,>=1.5.0->pandasai[connectors])
Downloading asn1crypto-1.5.1-py2.py3-none-any.whl (105 kB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m105.0/105.0 kB[0m [31m11.4 MB/s[0m eta [36m0:00:00[0m
[?25hRequirement already satisfied: cffi<2.0.0,>=1.9 in /usr/local/lib/python3.10/dist-packages (from snowflake-connector-python<4.0.0->snowflake-sqlalchemy<2.0.0,>=1.5.0->pandasai[connectors]) (1.15.1)
Requirement already satisfied: cryptography<42.0.0,>=3.1.0 in /usr/local/lib/python3.10/dist-packages (from snowflake-connector-python<4.0.0->snowflake-sqlalchemy<2.0.0,>=1.5.0->pandasai[connectors]) (41.0.3)
Collecting oscrypto<2.0.0 (from snowflake-connector-python<4.0.0->snowflake-sqlalchemy<2.0.0,>=1.5.0->pandasai[connectors])
Downloading oscrypto-1.3.0-py2.py3-none-any.whl (194 kB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m194.6/194.6 kB[0m [31m18.8 MB/s[0m eta [36m0:00:00[0m
[?25hRequirement already satisfied: pyOpenSSL<24.0.0,>=16.2.0 in /usr/local/lib/python3.10/dist-packages (from snowflake-connector-python<4.0.0->snowflake-sqlalchemy<2.0.0,>=1.5.0->pandasai[connectors]) (23.2.0)
Collecting pycryptodomex!=3.5.0,<4.0.0,>=3.2 (from snowflake-connector-python<4.0.0->snowflake-sqlalchemy<2.0.0,>=1.5.0->pandasai[connectors])
Downloading pycryptodomex-3.19.0-cp35-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.1 MB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.1/2.1 MB[0m [31m43.4 MB/s[0m eta [36m0:00:00[0m
[?25hRequirement already satisfied: pyjwt<3.0.0 in /usr/lib/python3/dist-packages (from snowflake-connector-python<4.0.0->snowflake-sqlalchemy<2.0.0,>=1.5.0->pandasai[connectors]) (2.3.0)
Collecting urllib3>=1.0 (from databricks-sql-connector<3,>=2->sqlalchemy-databricks<0.3.0,>=0.2.0->pandasai[connectors])
Downloading urllib3-1.26.16-py2.py3-none-any.whl (143 kB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m143.1/143.1 kB[0m [31m14.7 MB/s[0m eta [36m0:00:00[0m
[?25hRequirement already satisfied: filelock<4,>=3.5 in /usr/local/lib/python3.10/dist-packages (from snowflake-connector-python<4.0.0->snowflake-sqlalchemy<2.0.0,>=1.5.0->pandasai[connectors]) (3.12.2)
Requirement already satisfied: sortedcontainers>=2.4.0 in /usr/local/lib/python3.10/dist-packages (from snowflake-connector-python<4.0.0->snowflake-sqlalchemy<2.0.0,>=1.5.0->pandasai[connectors]) (2.4.0)
Collecting platformdirs<3.9.0,>=2.6.0 (from snowflake-connector-python<4.0.0->snowflake-sqlalchemy<2.0.0,>=1.5.0->pandasai[connectors])
Downloading platformdirs-3.8.1-py3-none-any.whl (16 kB)
Collecting tomlkit (from snowflake-connector-python<4.0.0->snowflake-sqlalchemy<2.0.0,>=1.5.0->pandasai[connectors])
Downloading tomlkit-0.12.1-py3-none-any.whl (37 kB)
Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.5->pandasai[connectors]) (23.1.0)
Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.5->pandasai[connectors]) (6.0.4)
Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.5->pandasai[connectors]) (4.0.3)
Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.5->pandasai[connectors]) (1.9.2)
Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.5->pandasai[connectors]) (1.4.0)
Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.5->pandasai[connectors]) (1.3.1)
Requirement already satisfied: executing>=1.2.0 in /usr/local/lib/python3.10/dist-packages (from stack-data->ipython<9.0.0,>=8.13.1->pandasai[connectors]) (1.2.0)
Requirement already satisfied: asttokens>=2.1.0 in /usr/local/lib/python3.10/dist-packages (from stack-data->ipython<9.0.0,>=8.13.1->pandasai[connectors]) (2.4.0)
Requirement already satisfied: pure-eval in /usr/local/lib/python3.10/dist-packages (from stack-data->ipython<9.0.0,>=8.13.1->pandasai[connectors]) (0.2.2)
Collecting Mako (from alembic<2.0.0,>=1.0.11->databricks-sql-connector<3,>=2->sqlalchemy-databricks<0.3.0,>=0.2.0->pandasai[connectors])
Downloading Mako-1.2.4-py3-none-any.whl (78 kB)
[2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m78.7/78.7 kB[0m [31m8.9 MB/s[0m eta [36m0:00:00[0m
[?25hRequirement already satisfied: pycparser in /usr/local/lib/python3.10/dist-packages (from cffi<2.0.0,>=1.9->snowflake-connector-python<4.0.0->snowflake-sqlalchemy<2.0.0,>=1.5.0->pandasai[connectors]) (2.21)
Requirement already satisfied: et-xmlfile in /usr/local/lib/python3.10/dist-packages (from openpyxl<4.0.0,>=3.0.10->databricks-sql-connector<3,>=2->sqlalchemy-databricks<0.3.0,>=0.2.0->pandasai[connectors]) (1.1.0)
Requirement already satisfied: MarkupSafe>=0.9.2 in /usr/local/lib/python3.10/dist-packages (from Mako->alembic<2.0.0,>=1.0.11->databricks-sql-connector<3,>=2->sqlalchemy-databricks<0.3.0,>=0.2.0->pandasai[connectors]) (2.1.3)
Building wheels for collected packages: PyHive, thrift
Building wheel for PyHive (setup.py) ... [?25l[?25hdone
Created wheel for PyHive: filename=PyHive-0.7.0-py3-none-any.whl size=53872 sha256=1d2a90767825eb44f25f15a386a3191b47df6b0fe2ee1c8b3718ad3c9e9c3592
Stored in directory: /root/.cache/pip/wheels/d3/fc/31/6974270c69ccc5bf8f848e2e41b527d0e8f5b9b973696a29a9
Building wheel for thrift (setup.py) ... [?25l[?25hdone
Created wheel for thrift: filename=thrift-0.16.0-cp310-cp310-linux_x86_64.whl size=373871 sha256=9b6ad9cfb506732a6582e3d6f87721c28d3255de8eeeadf47ed54445c360ad89
Stored in directory: /root/.cache/pip/wheels/52/f8/d2/acfd995e8247eb0cad372fa6a640a5fcf279ab2ed7c5c4490e
Successfully built PyHive thrift
Installing collected packages: asn1crypto, urllib3, tomlkit, thrift, pymysql, pycryptodomex, platformdirs, oscrypto, Mako, lz4, PyHive, alembic, databricks-sql-connector, sqlalchemy-databricks, snowflake-connector-python, snowflake-sqlalchemy
Attempting uninstall: urllib3
Found existing installation: urllib3 2.0.4
Uninstalling urllib3-2.0.4:
Successfully uninstalled urllib3-2.0.4
Attempting uninstall: platformdirs
Found existing installation: platformdirs 3.10.0
Uninstalling platformdirs-3.10.0:
Successfully uninstalled platformdirs-3.10.0
Successfully installed Mako-1.2.4 PyHive-0.7.0 alembic-1.12.0 asn1crypto-1.5.1 databricks-sql-connector-2.9.3 lz4-4.3.2 oscrypto-1.3.0 platformdirs-3.8.1 pycryptodomex-3.19.0 pymysql-1.1.0 snowflake-connector-python-3.2.0 snowflake-sqlalchemy-1.5.0 sqlalchemy-databricks-0.2.0 thrift-0.16.0 tomlkit-0.12.1 urllib3-1.26.16
# 导入MySQLConnector和PostgreSQLConnector类from pandasai.connectors import MySQLConnector, PostgreSQLConnector
# 使用MySQL数据库
loan_connector = MySQLConnector(
config={"host":"localhost",# 主机名"port":3306,# 端口号"database":"mydb",# 数据库名"username":"root",# 用户名"password":"root",# 密码"table":"loans",# 表名"where":[# 这是可选的,用于过滤数据以减少数据框的大小["loan_status","=","PAIDOFF"],# 过滤条件],})# 使用PostgreSQL数据库
payment_connector = PostgreSQLConnector(
config={"host":"localhost",# 主机名"port":5432,# 端口号"database":"mydb",# 数据库名"username":"root",# 用户名"password":"root",# 密码"table":"payments",# 表名"where":[# 这是可选的,用于过滤数据以减少数据框的大小["payment_status","=","PAIDOFF"],# 过滤条件],})# 创建SmartDatalake对象,将MySQLConnector和PostgreSQLConnector对象作为参数传入
df_connector = SmartDatalake([loan_connector, payment_connector], config={"llm": llm})# 调用chat方法,传入问题作为参数,返回答案
response = df_connector.chat("How many loans from the United states?")print(response)
# 导入YahooFinanceConnector模块from pandasai.connectors.yahoo_finance import YahooFinanceConnector
# 创建一个YahooFinanceConnector对象,参数为股票代码"MSFT"
yahoo_connector = YahooFinanceConnector("MSFT")# 使用YahooFinanceConnector对象创建一个SmartDataframe对象,同时传入配置参数{"llm": llm}
df = SmartDataframe(yahoo_connector, config={"llm": llm})# 使用SmartDataframe对象的chat方法进行对话,参数为询问昨天的收盘价
response = df.chat("What is the closing price for yesterday?")# 打印返回的结果print(response)
The closing price for yesterday was $319.53.
# 创建一个YahooFinanceConnector对象,传入参数为股票代码"TSLA"
yahoo_connector = YahooFinanceConnector("TSLA")# 创建一个SmartDataframe对象,传入参数为yahoo_connector和配置参数{"llm": llm}
df_connector = SmartDataframe(yahoo_connector, config={"llm": llm})# 调用df_connector的chat方法,传入参数为"Plot the chart of tesla over time",返回结果赋值给response
response = df_connector.chat("Plot the chart of tesla over time")
您可以在此处找到有关连接器(以及更多连接器)的更多信息:https://docs.pandas-ai.com/en/latest/connectors/
版权归原作者 数智笔记 所有, 如有侵权,请联系我们删除。