This commit fixes a race condition in the LLM callback system where
multiple LLM instances calling set_callbacks concurrently could cause
callbacks to be removed before they fire.
Changes:
- Add class-level RLock (_callback_lock) to LLM class to synchronize
  access to global litellm callbacks
- Wrap callback registration and LLM call execution in the lock for
  both call() and acall() methods
- Use RLock (reentrant lock) to handle recursive calls without deadlock
  (e.g., when retrying with unsupported 'stop' parameter)
- Remove sleep(5) workaround from test_llm_callback_replacement test
- Add new test_llm_callback_lock_prevents_race_condition test to verify
  concurrent callback access is properly synchronized
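
The locking pattern described in the changes above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: `SketchLLM`, `GLOBAL_CALLBACKS`, the `retry` flag, and `run_concurrently` are hypothetical stand-ins, and the global list only imitates a process-wide callback registry rather than litellm's actual API. Only the synchronous path is shown.

```python
import threading

# Hypothetical stand-in for a process-wide callback registry
# (an assumption for illustration, not litellm's real data structure).
GLOBAL_CALLBACKS = []


class SketchLLM:
    # Class-level reentrant lock, shared by every instance.
    _callback_lock = threading.RLock()

    def __init__(self, name):
        self.name = name

    def set_callbacks(self, callbacks):
        # Replaces the shared list; without a lock, another instance could
        # wipe these callbacks before this instance's call uses them.
        GLOBAL_CALLBACKS.clear()
        GLOBAL_CALLBACKS.extend(callbacks)

    def call(self, prompt, callbacks, retry=True):
        # Registration and execution happen under the same lock.
        with SketchLLM._callback_lock:
            self.set_callbacks(callbacks)
            if retry:
                # Simulated retry path: the same thread re-enters the lock.
                # A plain Lock would deadlock here; an RLock does not.
                return self.call(prompt, callbacks, retry=False)
            return [cb(prompt) for cb in GLOBAL_CALLBACKS]


def run_concurrently(n):
    """Run n instances in parallel and collect what each one's callbacks saw."""
    results = {}

    def worker(i):
        llm = SketchLLM(f"llm-{i}")
        callback = lambda p, name=llm.name: f"{name}:{p}"
        results[i] = llm.call("hello", [callback])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In this sketch each thread should only ever observe its own callback, because no other instance can replace the shared list between registration and execution. The trade-off is that calls through the shared lock are serialized.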
Fixes #4214
Co-Authored-By: João <joao@crewai.com>