* Updated excel_knowledge_source.py to account for excel sheets that have multiple tabs. The old implementation contained a single df=pd.read_excel(excel_file_path), which only reads the first or most recently used excel sheet. The updated functionality reads all sheets in the excel workbook.
* updated load_content() function in excel_knowledge_source.py to reduce memory usage and provide better documentation
* accidentally didn't delete the old load_content() function in last commit - corrected this
* Added an override for the content field from the inheritted BaseFileKnowledgeSource to account for the change in the load_content method to support excel files with multiple tabs/sheets. This change should ensure it passes the type check test, as it failed before since content was assigned a different type in BaseFileKnowledgeSource
* Now removed the commented out imports in _import_dependencies, as requested
* Updated excel_knowledge_source to fix linter errors and type errors. Changed inheritence from basefileknowledgesource to baseknowledgesource because basefileknowledgesource's types conflicted (in particular the load_content function and the content class variable.
---------
Co-authored-by: Lorenze Jay <63378463+lorenzejay@users.noreply.github.com>
* drop metadata requirement
* fix linting
* Update docs for new knowledge
* more linting
* more linting
* make save_documents private
* update docs to the new way we use knowledge and include clearing memory
* Knowledge project directory standard
* fixed types
* comment fix
* made base file knowledge source an abstract class
* cleaner validator on model_post_init
* fix type checker
* cleaner refactor
* better template
* initial knowledge
* WIP
* Adding core knowledge sources
* Improve types and better support for file paths
* added additional sources
* fix linting
* update yaml to include optional deps
* adding in lorenze feedback
* ensure embeddings are persisted
* improvements all around Knowledge class
* return this
* properly reset memory
* properly reset memory+knowledge
* consolodation and improvements
* linted
* cleanup rm unused embedder
* fix test
* fix duplicate
* generating cassettes for knowledge test
* updated default embedder
* None embedder to use default on pipeline cloning
* improvements
* fixed text_file_knowledge
* mypysrc fixes
* type check fixes
* added extra cassette
* just mocks
* linted
* mock knowledge query to not spin up db
* linted
* verbose run
* put a flag
* fix
* adding docs
* better docs
* improvements from review
* more docs
* linted
* rm print
* more fixes
* clearer docs
* added docstrings and type hints for cli
---------
Co-authored-by: João Moura <joaomdmoura@gmail.com>
Co-authored-by: Lorenze Jay <lorenzejaytech@gmail.com>