Sunday, July 20, 2025

Next-Gen AI: Multi-Agent LLMs and Policy Gradient RL (Explained)

Introduction. 

Artificial Intelligence (AI) is moving beyond single-task chatbots and into a future where multiple smart agents work together—and learn from their experiences. This new wave of AI is powered by Multi-Agent Large Language Models (LLMs) and Reinforcement Learning (RL). Let’s break down what this means, and why it matters for everyone.

What Are Multi-Agent LLMs?

If you’ve ever chatted with an AI like ChatGPT or Google Gemini, you’ve experienced a single “agent” at work. But imagine if you had a whole team of AI experts—each with a different specialty—collaborating to answer your questions or solve your problems.

That’s what Multi-Agent LLMs are: several AI “personalities” (like a general doctor, a specialist, and a risk manager) working together. They can ask each other questions, give advice, and debate the best answer—just like a real-world panel of experts.
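To make this concrete, here is a minimal sketch of a "panel of experts" built from a single chat model: the same question is sent with three different system personas, and the answers are collected side by side. The ask_llm() helper and the persona wording are illustrative placeholders, not part of the tutorial code below (which calls the Groq API directly).

# Minimal "panel of experts" sketch: one chat model, three personas.
# ask_llm() is a stand-in for any chat-completion call (e.g., the Groq request
# used in the tutorial below); here it just echoes a placeholder answer.
def ask_llm(system_prompt: str, user_prompt: str) -> str:
    return f"({system_prompt.split(',')[0]}) would say: ..."  # stub for illustration

PERSONAS = {
    "Internist": "You are a cautious internist, reason step by step.",
    "Specialist": "You are an infectious-disease specialist, focus on severe causes.",
    "Risk manager": "You are a clinical risk manager, focus on patient safety.",
}

def panel_answers(question: str) -> dict:
    # Each persona sees the same question; their answers can then be compared,
    # debated, or handed to a final decision-maker.
    return {name: ask_llm(role, question) for name, role in PERSONAS.items()}

print(panel_answers("Patient has fever and cough. What should we do next?"))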

What Is Reinforcement Learning?

Reinforcement Learning (RL) is how AI learns by doing. The AI agent tries actions, gets feedback (rewards for good decisions, penalties for mistakes), and gradually figures out the smartest way to act. It’s like how we learn new skills—trial and error, over many attempts.
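As a toy illustration of that trial-and-error loop (separate from the tutorial code below), here is a two-armed bandit: the agent tries actions, receives rewards, and its value estimates gradually come to favor the better action. The reward probabilities are made-up numbers.

import random

TRUE_REWARD_PROB = [0.3, 0.7]   # hidden reward chances for the two actions (made up)
value_estimates = [0.0, 0.0]    # the agent's current guess for each action
counts = [0, 0]

for t in range(1000):
    # Explore occasionally, otherwise pick the action that currently looks best.
    if random.random() < 0.1:
        action = random.randrange(2)
    else:
        action = max(range(2), key=lambda a: value_estimates[a])
    reward = 1.0 if random.random() < TRUE_REWARD_PROB[action] else 0.0
    counts[action] += 1
    # Incremental average: nudge the estimate toward the observed reward.
    value_estimates[action] += (reward - value_estimates[action]) / counts[action]

print("Learned values:", [round(v, 2) for v in value_estimates])  # roughly [0.3, 0.7]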

Why Combine Them?

When you combine the “brainpower” of multiple LLM agents with RL’s ability to learn from experience, you get something powerful:

  • The AI agent learns to use advice from different experts, not just rely on one.

  • Over time, it gets better at making complex decisions—whether it’s diagnosing patients, handling business workflows, or answering tough questions.

  • The teamwork approach makes the system more robust, explainable, and safe.

A Real Example

In a recent AI project, we trained an agent to diagnose patient cases. It didn’t just rely on one answer—instead, it asked three LLM advisors (each playing a different medical role) for opinions, then decided what to do. As it learned from rewards and mistakes, its accuracy went up. That’s the magic of next-gen AI: collaborative, continuously learning, and smarter with every step.

Tutorial.


Code:

import numpy as np
import random
import keras
from keras import layers
from keras.optimizers import Adam
from keras.models import Model
from sentence_transformers import SentenceTransformer
import matplotlib.pyplot as plt
import requests
import time
import tensorflow as tf

# -------- Groq API Config --------
USE_REAL_LLM = True # Set False for mock/test
GROQ_API_KEY = "Use your key"
ENDPOINT = "https://api.groq.com/openai/v1/chat/completions"
MODEL = "llama3-70b-8192" # Or "llama3-8b-8192"

N_EPISODES = 10 # Lower for demo, increase for more training
EMBED_DIM = 32
N_ADVISORS = 3
N_ACTIONS = 5
GAMMA = 0.99

# ---- Synthetic Patient Dataset ----
patient_cases = [
    (1, 1, 1, 'flu'),
    (1, 0, 0, 'cold'),
    (0, 1, 0, 'cold'),
    (1, 1, 0, 'flu'),
    (1, 0, 1, 'flu'),
    (0, 1, 1, 'flu'),
    (0, 0, 0, 'cold'),
    (1, 0, 0, 'cold'),
    (0, 0, 1, 'cold'),
]

def sample_case():
    return random.choice(patient_cases)

# ---- Real Groq LLM API Adapter ----
def query_llm_groq(prompt, personality_name):
    if not USE_REAL_LLM:
        if personality_name == "Internist":
            return "Stepwise testing is safest; treat if strong evidence only."
        elif personality_name == "Specialist":
            return "Rule out severe cases, do broad diagnostics."
        elif personality_name == "Generalist":
            return "Prioritize patient comfort and minimal intervention."
        else:
            return "No specific advice."
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {GROQ_API_KEY}"
    }
    system_prompt = f"You are a {personality_name} medical advisor. Return a one-sentence actionable recommendation for the case."
    payload = {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt}
        ],
        "max_tokens": 50,
        "temperature": 0.3,
        "n": 1
    }
    for attempt in range(3):  # Retry on error
        try:
            response = requests.post(ENDPOINT, headers=headers, json=payload, timeout=20)
            if response.status_code == 200:
                out = response.json()
                return out['choices'][0]['message']['content'].strip()
            else:
                print(f"Groq LLM error code {response.status_code}, retrying...")
                time.sleep(2)
        except Exception as e:
            print(f"Groq Exception: {e}, retrying...")
            time.sleep(2)
    return "[LLM Error or Timeout]"

# --- LLM Advisors (Groq + Llama3, with negotiation) ---
def get_advisors(state, prev_advices=None):
    personalities = ["Internist", "Specialist", "Generalist"]
    advices = []
    for idx, personality in enumerate(personalities):
        prompt = f"Patient symptoms: fever={state[0]}, cough={state[1]}, risk factors={state[2]}."
        if prev_advices:
            prompt += f" Previous advisor opinions: {' | '.join(prev_advices)}"
            prompt += " Revise or comment if needed."
        advice = query_llm_groq(prompt, personality)
        advices.append(advice)
    return advices

# --- Patient Environment ---
class PatientEnv:
    def reset(self):
        fever, cough, risk, diag = sample_case()
        self.state = [fever, cough, risk]
        self.true_diagnosis = diag
        return np.array(self.state, dtype=np.float32), diag

    def step(self, action):
        reward = 0; done = False
        if action == 0: reward = -2  # order test
        elif action == 1:  # diagnose cold
            if self.true_diagnosis == 'cold': reward = 10; done = True
            else: reward = -10; done = True
        elif action == 2:  # diagnose flu
            if self.true_diagnosis == 'flu': reward = 10; done = True
            else: reward = -10; done = True
        elif action == 3: reward = -2  # prescribe
        elif action == 4: reward = 0; done = True  # refer
        else: reward = -5
        return reward, done

# --- Embedding Model ---
embedder = SentenceTransformer('all-MiniLM-L6-v2')

def embed_sentences(sentences):
    arr = embedder.encode(sentences)
    if arr.shape[1] > EMBED_DIM:
        arr = arr[:, :EMBED_DIM]
    return arr

# --- Keras 3 RL Policy Network with Attention ---
def build_policy_network(state_dim, emb_dim, n_advisors, n_actions):
    state_in = keras.Input(shape=(state_dim,), name="state")
    advisor_emb_in = keras.Input(shape=(n_advisors, emb_dim), name="advisor_emb")
    x = layers.TimeDistributed(layers.Dense(emb_dim, activation='relu'))(advisor_emb_in)
    attn_scores = layers.TimeDistributed(layers.Dense(1))(x)
    attn_scores_flat = layers.Flatten()(attn_scores)
    attn_weights = layers.Activation('softmax', name='attn_weights')(attn_scores_flat)
    attn_weights_exp = layers.Reshape((n_advisors, 1))(attn_weights)
    advisor_context = layers.Dot(axes=1)([attn_weights_exp, x])
    advisor_context = layers.Flatten()(advisor_context)
    concat = layers.Concatenate()([state_in, advisor_context])
    dense = layers.Dense(64, activation='relu')(concat)
    out = layers.Dense(n_actions, activation='softmax')(dense)
    model = keras.Model([state_in, advisor_emb_in], out)
    # Separate model (same graph) for extracting the attention weights
    attn_model = Model([state_in, advisor_emb_in], attn_weights)
    return model, attn_model

# --- Training Loop: REINFORCE Policy Gradient ---
env = PatientEnv()
policy_net, attn_model = build_policy_network(3, EMBED_DIM, N_ADVISORS, N_ACTIONS)
optimizer = Adam(learning_rate=1e-3)

reward_history = []
for episode in range(N_EPISODES):
    state, diag = env.reset()
    episode_logprobs = []
    episode_rewards = []
    done = False
    step = 0
    while not done:
        # Advisors: negotiation (second round sees the first-round opinions)
        advices = get_advisors(state)
        advices = get_advisors(state, advices)
        advisor_embs = embed_sentences(advices)
        advisor_embs = advisor_embs[np.newaxis, ...]
        state_batch = state[np.newaxis, ...]
        # Policy step
        probs = policy_net([state_batch, advisor_embs]).numpy()[0]
        action = np.random.choice(N_ACTIONS, p=probs)
        # Log-prob for policy gradient
        logprob = np.log(probs[action] + 1e-8)
        episode_logprobs.append(logprob)
        # Step in env
        reward, done = env.step(action)
        episode_rewards.append(reward)
        step += 1

    # --- Policy gradient update (REINFORCE) ---
    returns = []
    G = 0
    for r in reversed(episode_rewards):
        G = r + GAMMA * G
        returns.insert(0, G)
    returns = np.array(returns)
    if len(returns) > 1:  # normalize (skipped for one-step episodes, where the std is zero)
        returns = (returns - returns.mean()) / (returns.std() + 1e-8)

    # Policy loss (one step, for demo): re-run the policy on the last state and
    # advisor embeddings of the episode so the gradient matches the action taken.
    with tf.GradientTape() as tape:
        probs = policy_net([state_batch, advisor_embs], training=True)[0]
        loss = -tf.math.log(probs[action] + 1e-8) * returns[-1]
    grads = tape.gradient(loss, policy_net.trainable_weights)
    optimizer.apply_gradients(zip(grads, policy_net.trainable_weights))

    reward_history.append(np.sum(episode_rewards))

    # Print logs
    if episode < 3 or episode % 5 == 0:
        action_names = ["Order test", "Diagnose cold", "Diagnose flu", "Prescribe", "Refer"]
        attn_vals = attn_model([state_batch, advisor_embs]).numpy()[0]
        top_advisor = np.argmax(attn_vals)
        print(f"\n--- Episode {episode} ---")
        print(f"Patient: fever={state[0]}, cough={state[1]}, risk={state[2]}, true_diag={diag}")
        for i, a in enumerate(advices):
            print(f"Advisor {i+1}: {a}")
        print(f"Agent chose: {action_names[action]} (Reward: {reward})")
        print(f"Attention: Advisor {top_advisor+1} most influential ({attn_vals[top_advisor]:.2f})")

# --- Plot reward vs. episode ---
plt.plot(reward_history)
plt.xlabel("Episode")
plt.ylabel("Reward")
plt.title("Keras 3 RL Agent + Groq Llama3 LLM Advisors: Reward vs. Episode")
plt.show()

Reference:

  1. Yao, S., Zhao, X., et al. "Tree of Thoughts: Deliberate Problem Solving with Large Language Models," arXiv preprint arXiv:2305.10601, 2023.
  2. Guohao Li, Hasan Abed Al Kader Hammoud, Hani Itani, Dmitrii Khizbullin, Bernard Ghanem. "CAMEL: Communicative Agents for 'Mind' Exploration of Large Scale Language Model Society," arXiv preprint arXiv:2303.17760, 2023.
  3. Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, Chi Wang. "AutoGen: Enabling next-generation multi-agent LLM applications," arXiv preprint arXiv:2308.08155, 2023.
  4. Sutton, R. S., & Barto, A. G. "Reinforcement Learning: An Introduction," 2nd Edition, MIT Press, 2018.
  5. Guanzhi Wang, Yuqi Xie, Yunfan Jiang, Ajay Mandlekar, Chaowei Xiao, Yuke Zhu, Linxi Fan, Anima Anandkumar. "Voyager: An Open-Ended Embodied Agent with Large Language Models," arXiv preprint arXiv:2305.16291, 2023.
  6. Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao. "Reflexion: Language Agents with Verbal Reinforcement Learning," arXiv preprint arXiv:2303.11366, 2023.

Saturday, July 12, 2025

AI vs. Human Writing: Robust Hybrid LLM-Aided Detection (source code)

Introduction.

Can you really tell whether a text was written by a human or by ChatGPT? In this step-by-step tutorial, discover the secrets of hybrid AI detection: combining advanced statistical analysis with the power of Large Language Models (LLMs) to confidently distinguish human writing from AI-generated content. You’ll learn:

  • How semantic, structural, and entropy features reveal AI text
  • Why LLM meta-classification (self-consistency voting) beats single-method detection
  • How adversarial tricks try to fool detectors—and how to spot them
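Before diving into the full program, here is the core decision rule it implements, previewed with illustrative numbers: a statistical probability (built from the semantic, structural, and entropy features) is blended with the averaged LLM vote and compared against a threshold, mirroring what RHLAD_analyze() does below.

def hybrid_verdict(stat_p_ai, llm_p_ai, alpha=0.5, beta=0.5, threshold=0.7):
    # Weighted blend of the statistical score and the LLM self-consistency average,
    # as used in RHLAD_analyze() further down.
    combined = alpha * llm_p_ai + beta * stat_p_ai
    return ("AI" if combined > threshold else "Human"), combined

print(hybrid_verdict(stat_p_ai=0.82, llm_p_ai=0.65))  # ('AI', 0.735) with the default threshold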

Working Code.

import os
import groq
import numpy as np
import nltk
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
from collections import Counter
from nltk.util import ngrams
import re

nltk.download('punkt')

# Create the Groq client (set GROQ_API_KEY in your environment or paste your key here)
client = groq.Groq(api_key=os.getenv("GROQ_API_KEY") or "Input Your Key")

############################
# Statistical Feature Extraction
############################

def split_paragraphs(text):
    return [p.strip() for p in text.split('\n') if p.strip()]

def sentences(text):
    return nltk.sent_tokenize(text)

def complex_sentence_ratio(text, threshold=20):
    sents = sentences(text)
    return sum(1 for s in sents if len(s.split()) > threshold) / max(1, len(sents))

def entropy(ngram_list):
    total = sum(ngram_list.values())
    if total == 0:
        return 0.0
    probs = np.array(list(ngram_list.values())) / total
    return -np.sum(probs * np.log2(probs + 1e-12))

def get_entropy(text, n):
    words = nltk.word_tokenize(text)
    if len(words) < n:
        return 0.0
    ngrams_list = list(ngrams(words, n))
    counts = Counter(ngrams_list)
    return entropy(counts)

def semantic_consistency(text, model):
    chunks = split_paragraphs(text)
    if len(chunks) < 2:
        return 0.5  # fallback for short text
    embeddings = model.encode(chunks)
    similarities = [cosine_similarity([embeddings[i]], [embeddings[i+1]])[0][0] for i in range(len(embeddings)-1)]
    return float(np.mean(similarities))

def structural_complexity(text, L_ref=17, w1=0.3, w2=0.6, w3=0.1):
    sents = sentences(text)
    lens = [len(nltk.word_tokenize(s)) for s in sents]
    if not lens:
        return 0.0
    L_avg = np.mean(lens)
    L_var = np.var(lens)
    F_cmplx = complex_sentence_ratio(text)
    S_struc = w1*L_var + w2*F_cmplx - w3*abs(L_avg - L_ref)
    return float(S_struc)

def linguistic_entropy(text):
    H1 = get_entropy(text, 1)
    H2 = get_entropy(text, 2)
    H3 = get_entropy(text, 3)
    return float(np.mean([H1, H2, H3]))

def SSDD_score(text, model, alpha1=2, alpha2=1, alpha3=1):
    S_sem = semantic_consistency(text, model)
    S_struc = structural_complexity(text)
    S_entropy = linguistic_entropy(text)
    z = alpha1 * S_sem - alpha2 * S_struc - alpha3 * S_entropy
    stat_prob = 1 / (1 + np.exp(-z))
    return stat_prob, S_sem, S_struc, S_entropy

############################
# LLM Meta-Classification (with Self-Consistency)
############################

LLM_VOTING_PROMPTS = [
    "Here is a text sample:\n{text}\n\nDo you think this was written by a human or by an AI assistant (like AI or Llama3)? Please respond with:\n- Answer: [Human/AI]\n- Probability: [0.0 - 1.0]\n- Explanation: [your explanation]",
    "Read the following sample and estimate if it was written by a person or a language model like AI/Llama. Reply only as:\nAnswer: [Human/AI]\nProbability: [number]\nExplanation: [reason]\n\nSample:\n{text}",
    "Given this passage, tell me if it's most likely AI or human generated. Provide your guess, a probability (0-1), and your main reason.\n\n{text}"
]

def parse_llm_response(output):
    # Try to robustly parse: Answer, Probability, Explanation
    try:
        answer_match = re.search(r'Answer:\s*(AI|Human)', output, re.IGNORECASE)
        prob_match = re.search(r'Probability:\s*([0-9.]+)', output)
        explanation_match = re.search(r'Explanation:(.*)', output, re.DOTALL | re.IGNORECASE)
        llm_pred = "AI" if answer_match and "AI" in answer_match.group(1).upper() else "Human"
        llm_prob = float(prob_match.group(1)) if prob_match else 0.5
        explanation = explanation_match.group(1).strip() if explanation_match else output.strip()
    except Exception as e:
        llm_pred, llm_prob, explanation = "Unknown", 0.5, output.strip()
    return llm_pred, llm_prob, explanation

def llm_detection_self_consistency(text, model_name="llama3-70b-8192", n_prompts=3):
    llm_probs, llm_preds, explanations = [], [], []
    for i in range(n_prompts):
        prompt = LLM_VOTING_PROMPTS[i % len(LLM_VOTING_PROMPTS)].format(text=text)
        try:
            response = client.chat.completions.create(
                model=model_name,
                messages=[{"role": "user", "content": prompt}],
                max_tokens=300,
                temperature=0
            )
            output = response.choices[0].message.content
        except Exception as ex:
            output = "Answer: Unknown\nProbability: 0.5\nExplanation: API error: " + str(ex)
        llm_pred, llm_prob, explanation = parse_llm_response(output)
        llm_probs.append(llm_prob)
        llm_preds.append(llm_pred)
        explanations.append(explanation)
    # Majority vote on the label, average of the probabilities
    avg_prob = np.mean(llm_probs)
    maj_pred = "AI" if llm_preds.count("AI") >= n_prompts//2 + 1 else "Human"
    concat_explanation = "\n\n".join([f"Prompt {i+1}: {exp}" for i, exp in enumerate(explanations)])
    return maj_pred, avg_prob, concat_explanation, llm_probs, llm_preds

############################
# Robust Hybrid Ensemble Detection (RHLAD)
############################

def adaptive_threshold(stat_probs, base=0.7):
    # Set threshold to max(base, mean + std/2): more robust on real-world text
    if not stat_probs:
        return base
    return float(max(base, np.mean(stat_probs) + np.std(stat_probs)/2))

def RHLAD_analyze(texts, model, alpha=0.5, beta=0.5, llm_model="llama3-70b-8192"):
    results = []
    stat_probs_all = []
    # Precompute all statistical scores for thresholding
    for text in texts:
        stat_P_AI, _, _, _ = SSDD_score(text, model)
        stat_probs_all.append(stat_P_AI)
    threshold = adaptive_threshold(stat_probs_all, base=0.7)
    for idx, text in enumerate(texts):
        stat_P_AI, S_sem, S_struc, S_entropy = SSDD_score(text, model)
        llm_pred, llm_prob, explanation, llm_probs_v, llm_preds_v = llm_detection_self_consistency(
            text, model_name=llm_model, n_prompts=3
        )
        combined_score = alpha * llm_prob + beta * stat_P_AI
        prediction = "AI" if combined_score > threshold else "Human"
        # Adversarial defense: if the statistical score is extremely high (>0.9),
        # flag as suspicious even if the LLM disagrees
        adversarial_flag = (stat_P_AI > 0.9 and llm_prob < 0.5)
        results.append({
            'index': idx,
            'stat_P_AI': stat_P_AI,
            'S_sem': S_sem,
            'S_struc': S_struc,
            'S_entropy': S_entropy,
            'llm_prob': llm_prob,
            'llm_probs_voting': llm_probs_v,
            'llm_preds_voting': llm_preds_v,
            'combined_score': combined_score,
            'prediction': prediction,
            'adversarial_flag': adversarial_flag,
            'llm_explanation': explanation,
            'threshold': threshold
        })
    return results

############################
# Sample Usage
############################

if __name__ == '__main__':
    texts = [
        # Human sample
        "When I woke up this morning, the sky was a pale blue and birds sang outside my window. I remembered my childhood days, full of laughter and chaos, and decided to write a letter to my old friend.",
        # AI-generated sample
        "Artificial intelligence, particularly language models like AI, have transformed the way we interact with technology. These models are trained on vast amounts of text data and can generate human-like responses to a wide range of queries.",
        # Paraphrased AI (tries to fool the system)
        "Leveraging massive datasets, today's AI models craft responses that feel increasingly human. Our interactions with technology have been revolutionized, thanks to these powerful language tools."
    ]
    print("Loading embedding model...")
    model = SentenceTransformer('all-MiniLM-L6-v2')

    print("\nRunning Robust Hybrid LLM-Aided Detection (RHLAD) using Groq Llama3...")
    results = RHLAD_analyze(texts, model, alpha=0.5, beta=0.5, llm_model="llama3-70b-8192")

    for res in results:
        print(f"\nSample #{res['index']+1} — Final Prediction: {res['prediction']} (Combined={res['combined_score']:.2f})")
        print(f"  Statistical SSDD Score: {res['stat_P_AI']:.2f} (Semantic={res['S_sem']:.2f}, Structure={res['S_struc']:.2f}, Entropy={res['S_entropy']:.2f})")
        print(f"  LLM Meta Probability (avg voting): {res['llm_prob']:.2f}")
        print(f"  LLM Voting Details: {res['llm_preds_voting']}, Probs={res['llm_probs_voting']}")
        print(f"  Explanation:\n{res['llm_explanation']}")
        print(f"  Adaptive Threshold used: {res['threshold']:.2f}")
        if res['adversarial_flag']:
            print("  [!ADVERSARIAL WARNING!] — Statistical and LLM signals disagree: possible paraphrased AI.")
        print("-" * 70)


References.

  1. Solaiman, I., Brundage, M., Clark, J., et al. "Release Strategies and the Social Impacts of Language Models." arXiv preprint arXiv:1908.09203, 2019.
  2. Bakhtin, A., Deng, Y., Ott, M., et al."Real or Fake? Learning to Discriminate Machine from Human Generated Text." arXiv preprint arXiv:1906.03351, 2019.
  3. Ippolito, D., Duckworth, D., Callison-Burch, C., & Eck, D. "Automatic Detection of Generated Text is Easiest when Humans are Fooled." Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), 2020, pp. 1808–1822.
  4. Jawahar, G., Sagot, B., & Seddah, D. "Automatic Detection of Machine Generated Text: A Critical Survey." arXiv preprint arXiv:2005.08512, 2020.
  5. Kreps, S., McCain, R. M., & Brundage, M. "All the News That's Fit to Fabricate: AI-Generated Text as a Tool of Media Misinformation." Journal of Experimental Political Science, vol. 10, no. 2, 2023, pp. 233–244.
  6. Mitchell, M., Wu, S., Zaldivar, A., Barnes, P., Vasserman, L., Hutchinson, B., Spitzer, E., Raji, I. D., & Gebru, T. "Model Cards for Model Reporting." Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT '19)*, 2019, pp. 220–229.

Saturday, February 15, 2025

Enhancing Algorithmic Trading with Neuro-Symbolic AI: A Hybrid Approach for Smarter Market Predictions

 



Introduction to Neuro-Symbolic AI

Neuro-Symbolic AI is an advanced paradigm that combines neural networks (deep learning) with symbolic reasoning to enhance decision-making, interpretability, and generalization. While neural models excel at pattern recognition from unstructured data, symbolic AI uses logic-based rules to perform reasoning and inference. The integration of these two approaches results in more robust, explainable, and data-efficient AI systems.
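As a toy illustration (made-up numbers, separate from the trading code below): a neural model produces a forecast, and symbolic rules then adjust or constrain it with explicit, inspectable logic.

def neural_forecast(features):
    # Stand-in for any learned model's prediction (toy value).
    return 105.0

def apply_symbolic_rules(forecast, volatility, last_close=100.0, max_daily_move=0.03):
    # Logic-based constraints layered on top of the learned prediction:
    # cap implausible jumps and damp the forecast when volatility is high.
    cap = last_close * (1 + max_daily_move)
    forecast = min(forecast, cap)
    if volatility > 0.05:
        forecast = 0.5 * forecast + 0.5 * last_close  # pull back toward the last price
    return forecast

print(apply_symbolic_rules(neural_forecast(None), volatility=0.08))  # 101.5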

Use of Neuro-Symbolic AI in Algorithmic Trading

In algorithmic trading, Neuro-Symbolic AI plays a crucial role in improving market predictions and trade execution by:

  • Combining deep learning (e.g., Transformer-based forecasting) with symbolic reasoning (e.g., sentiment analysis, volatility rules).
  • Adjusting price forecasts dynamically based on real-time sentiment analysis of news articles (see the sketch after this list).
  • Using logic-driven volatility factors to refine predictions and mitigate market noise.
  • Enhancing risk management by integrating rule-based adjustments to model-driven outputs.
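Concretely, the sentiment-and-volatility adjustment used later in neuro_symbolic_adjustment() is a simple multiplicative correction; here is a worked instance with illustrative numbers:

sentiment = 0.6        # LLM sentiment score in [-1.0, +1.0] (illustrative)
volatility = 0.02      # simulated volatility term (illustrative)
raw_forecast = 200.0   # price predicted by the neural model

symbolic_adjustment = sentiment * 0.05 + volatility   # 0.05 = sentiment weight used in the code
adjusted_forecast = max(0, raw_forecast * (1 + symbolic_adjustment))
print(adjusted_forecast)  # 210.0, i.e. a +5% correction from positive news and mild volatility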


Summary of the Code

The provided code implements an algorithmic trading model enhanced with Neuro-Symbolic AI techniques:

  1. Sentiment Analysis Using Llama3.1: Extracts sentiment scores for two stocks based on local news data.
  2. Stock Data Generation: Simulates stock prices using a Multivariate Gaussian distribution.
  3. i-Transformer Model for Forecasting: A deep learning model based on a Transformer encoder predicts future stock prices using a sliding window of past prices.
  4. Neuro-Symbolic Adjustment: Adjusts the model’s predictions using sentiment scores and a volatility factor.
  5. Simulated Trading Decisions: Iterates through test data, predicting stock prices and applying symbolic adjustments to improve trading decisions.

This hybrid approach enhances algorithmic trading by incorporating both data-driven (neural) and logic-driven (symbolic) methodologies, leading to more accurate and interpretable market predictions.

Code: Algorithmic Trading using Neuro-Symbolic AI (a toy example):

import numpy as np
import pandas as pd
import tensorflow as tf
import re
# from tensorflow import keras
from keras.models import Model
from keras.layers import Input, Dense, Dropout, MultiHeadAttention, LayerNormalization, \
    GlobalAveragePooling1D, Conv1D, Add
import json
import requests
from groq import Groq

client = Groq(
    api_key='PLEASE USE YOUR OWN API KEY',
)
stock1 = "Stock_A"
stock2 = "Stock_B"

# Step 1: Fetch News from Local Storage and Use Llama 3.1 for Sentiment Analysis
def fetch_news_from_local(stock_name):
    with open(stock_name, 'r') as file:
        news_data = json.load(file)
    return news_data  # Assume the JSON file contains text news data

context1 = str(fetch_news_from_local("news_data_Stock_A.json")) + ":\n"
question1 = "calculate the sentiment score in the range [-1.0 to +1.0] for " + stock1 + ", return only the number, nothing else"
context2 = str(fetch_news_from_local("news_data_Stock_B.json")) + ":\n"
question2 = "calculate the sentiment score in the range [-1.0 to +1.0] for " + stock2 + ", return only the number, nothing else"

# Combine context and question
input_prompt1 = f"{context1}\nQuestion: {question1}\nAnswer:"
input_prompt2 = f"{context2}\nQuestion: {question2}\nAnswer:"
# Step 2: Calculate the sentiment
def extract_sentiment_score(input_prompt):
    chat_completion = client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": input_prompt,
            }
        ],
        model="llama-3.1-8b-instant",
        # model="llama-3.1-70b-versatile",
        # model="llama3-70b-8192"
    )
    response = chat_completion.choices[0].message.content

    # Use regex to extract the first numerical value from the response
    match = re.search(r'[-+]?\d*\.\d+|\d+', response)  # Extracts numbers including negatives & decimals
    if match:
        return float(match.group())  # Convert extracted number to float
    else:
        raise ValueError(f"Could not extract a valid sentiment score from response: {response}")

llm_output1 = extract_sentiment_score(input_prompt1)
print(llm_output1)
llm_output2 = extract_sentiment_score(input_prompt2)
print(llm_output2)

# Step 3: Generate Synthetic Stock Data (Multivariate Gaussian)
def generate_stock_data(days=180, mean=[100, 200], cov=[[1, 0.5], [0.5, 1]]):
    np.random.seed(42)
    stock_data = np.random.multivariate_normal(mean, cov, size=days)
    df = pd.DataFrame(stock_data, columns=['Stock_A', 'Stock_B'])
    df['Date'] = pd.date_range(start='2024-01-01', periods=days, freq='D')
    return df

stock_df = generate_stock_data()

# Step 4: Prepare Data for i-Transformer (sliding window of past prices)
def prepare_data(df, window=10):
    data = df[['Stock_A', 'Stock_B']].values
    X, y = [], []
    for i in range(len(data) - window):
        X.append(data[i:i + window])
        y.append(data[i + window])
    return np.array(X), np.array(y)

X, y = prepare_data(stock_df)
train_size = int(0.8 * len(X))
X_train, X_test, y_train, y_test = X[:train_size], X[train_size:], y[:train_size], y[train_size:]

print("X_train => ", X_train)
print("X_test => ", X_test)

# Step 5: Build and Train the i-Transformer Forecasting Model
def i_transformer_encoder(inputs, head_size=64, num_heads=4, ff_dim=128, dropout=0.1):
    # Apply a Conv1D layer to project the input to the attention dimension
    x = Conv1D(filters=head_size, kernel_size=1, activation='relu')(inputs)

    # Multi-head self-attention
    attn_output = MultiHeadAttention(key_dim=head_size, num_heads=num_heads)(x, x)
    attn_output = Dropout(dropout)(attn_output)
    attn_output = LayerNormalization(epsilon=1e-6)(attn_output)

    # Ensure the attention output has the same shape as 'x' before addition
    attn_output = Dense(head_size)(attn_output)

    # Residual connection
    res = Add()([x, attn_output])

    # Feed-forward layers
    x = Dense(ff_dim, activation="relu")(res)
    x = Dropout(dropout)(x)
    x = Dense(head_size)(x)
    x = LayerNormalization(epsilon=1e-6)(x)

    return Add()([x, res])

input_shape = (X.shape[1], X.shape[2])
inputs = Input(shape=input_shape)
x = i_transformer_encoder(inputs)
x = GlobalAveragePooling1D()(x)
x = Dense(2)(x)

model = Model(inputs, x)
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, y_train, epochs=20, batch_size=16, validation_data=(X_test, y_test))

# Step 6: Advanced Neuro-Symbolic AI Adjustment
def neuro_symbolic_adjustment(forecast, input_prompt):
    # Sentiment score for the stock, obtained from the LLM
    sentiment_1 = extract_sentiment_score(input_prompt)
    # Simulated volatility factor for the stock
    volatility_factor_1 = np.std([np.random.uniform(-0.05, 0.05) for _ in range(10)])

    # Adjust the forecast with both factors
    symbolic_adjustment = sentiment_1 * 0.05 + volatility_factor_1
    adjusted_forecast = forecast * (1 + symbolic_adjustment)
    return max(0, adjusted_forecast)  # Ensure price does not go negative

# Step 7: Simulate Trading Decisions & Save to CSV
import csv

results = []

for i in range(len(y_test)):
    forecast = model.predict(X_test[i:i + 1])[0]

    # Raw predictions
    raw_forecast_A = forecast[0]
    raw_forecast_B = forecast[1]

    # Adjusted predictions using neuro-symbolic AI
    adjusted_forecast_A = neuro_symbolic_adjustment(raw_forecast_A, input_prompt1)
    adjusted_forecast_B = neuro_symbolic_adjustment(raw_forecast_B, input_prompt2)

    # Percentage change introduced by the neuro-symbolic adjustment
    change_A = ((adjusted_forecast_A - raw_forecast_A) / raw_forecast_A) * 100
    change_B = ((adjusted_forecast_B - raw_forecast_B) / raw_forecast_B) * 100

    # Print results
    print(f"Day {i + 1}: Stock_A Raw: {raw_forecast_A:.2f}, Adjusted: {adjusted_forecast_A:.2f} ({change_A:.2f}%)")
    print(f"        Stock_B Raw: {raw_forecast_B:.2f}, Adjusted: {adjusted_forecast_B:.2f} ({change_B:.2f}%)")

    # Append results to list for saving
    results.append([i + 1, raw_forecast_A, adjusted_forecast_A, change_A, raw_forecast_B, adjusted_forecast_B, change_B])

# Save results to CSV
csv_filename = "trading_predictions.csv"
with open(csv_filename, mode="w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["Day", "Stock_A_Raw", "Stock_A_Adjusted", "Stock_A_Change(%)",
                     "Stock_B_Raw", "Stock_B_Adjusted", "Stock_B_Change(%)"])
    writer.writerows(results)

print(f"\nPredictions saved to {csv_filename}")


Code: Dummy News Data Generator.

import json


def generate_dummy_news(stock_name):
    """
    Generates dummy news data for a given stock and saves it as a JSON file.
    :param stock_name: Name of the stock for which dummy news data is generated.
    """
    news_samples = {
        "Stock_A": [
            "Stock_A sees a surge in trading volume due to positive earnings report.",
            "Analysts predict continued growth for Stock_A in the coming quarter.",
            "Stock_A's new product launch receives strong market approval."
        ],
        "Stock_B": [
            "Stock_B faces regulatory scrutiny leading to a drop in stock value.",
            "Market experts suggest caution as Stock_B struggles with supply chain issues.",
            "Stock_B's latest acquisition expected to boost long-term profits."
        ]
    }

    with open(f'news_data_{stock_name}.json', 'w') as file:
        json.dump(news_samples.get(stock_name, []), file, indent=4)
    print(f"Dummy news data for {stock_name} saved.")


# Generate dummy news for Stock_A and Stock_B
generate_dummy_news("Stock_A")
generate_dummy_news("Stock_B")

Reference:

  1. Besold, T. R., et al. (2017). Neural-Symbolic Learning and Reasoning: A Survey and Interpretation.
  2. Garcez, A. S., & Lamb, L. C. (2020). Neurosymbolic AI: The 3rd Wave.
  3. Bengio, Y., et al. (2021). The Neuro-Symbolic Concept Learner.
  4. Mitchell, M. (2019). Artificial Intelligence: A Guide to Intelligent Systems.

Saturday, July 13, 2024

Finetune Large Language Models with DoRA (Weight-Decomposed Low-Rank Adaptation)

 

Introduction.

The DoRA (Weight-Decomposed Low-Rank Adaptation) algorithm offers an advanced approach to fine-tuning Large Language Models (LLMs) by decomposing weight matrices into magnitude and direction components. Traditional methods like Low-Rank Adaptation (LoRA) improve parameter efficiency but often face performance and stability trade-offs. DoRA addresses these issues by normalizing the columns of the weight matrix, separating a stable magnitude component from a fine-tuned direction component. This decomposition enables efficient learning while maintaining model expressiveness and stability. Key advantages of DoRA include enhanced parameter efficiency, improved generalization, faster adaptation to new tasks, and minimal inference overhead.
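The sketch below is a minimal PyTorch illustration of the decomposition itself, independent of the PEFT implementation used in the training script: per-column magnitudes and unit-norm directions are separated, the direction receives a LoRA-style low-rank update, and the weight is recomposed. The dimensions and rank are arbitrary demo values.

import torch

out_features, in_features, rank = 16, 8, 4
W0 = torch.randn(out_features, in_features)      # pretrained weight matrix

# Decompose: per-column magnitude m and unit-norm direction matrix V
m = W0.norm(p=2, dim=0, keepdim=True)            # shape (1, in_features)
V = W0 / m                                       # each column of V has norm 1

# Fine-tuning: the direction gets a LoRA-style low-rank update (B @ A),
# while the magnitude vector m is trained directly as its own parameter.
B = torch.zeros(out_features, rank)              # initialized to zero, as in LoRA
A = torch.randn(rank, in_features) * 0.01
V_new = V + B @ A

# Recompose the weight: renormalize the updated direction, then rescale by m
W_new = m * (V_new / V_new.norm(p=2, dim=0, keepdim=True))

print(torch.allclose(W_new, W0, atol=1e-5))      # True while the update is zero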

Key Points:

  • Decomposes weights into magnitude and direction.
  • Enhances parameter efficiency without compromising performance.
  • Improves training stability and generalization.
  • Facilitates faster adaptation to new tasks.
  • Maintains efficient inference with minimal overhead.

Video Tutorial.

Code: Finetune Large Language Models with DoRA (Train).

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments, Trainer
from peft import LoraConfig, get_peft_model
from datasets import Dataset
import transformers
# pip install peft
# Sample QA Data
data = {
    'question': [
        "What is the capital of France?",
        "Who painted the Mona Lisa?",
        "What is the tallest mountain in the world?",
        "When did World War II end?",
        "Who wrote the play 'Romeo and Juliet'?",
        "What is the chemical symbol for gold?"
    ],
    'context': [
        "Paris is the capital and most populous city of France.",
        "The Mona Lisa is a half-length portrait painting by Italian Renaissance artist Leonardo da Vinci.",
        "Mount Everest is Earth's highest mountain above sea level, located in the Mahalangur Himal sub-range of the Himalayas.",
        "World War II (WWII or WW2), also known as the Second World War, was a global war that lasted from 1939 to 1945.",
        "Romeo and Juliet is a tragedy written by William Shakespeare early in his career about two young star-crossed lovers whose deaths ultimately reconcile their feuding families.",
        "Gold is a chemical element with the symbol Au and atomic number 79. In its purest form, it is a bright, slightly reddish yellow, dense, soft, malleable, and ductile metal."
    ],
    'answer': [
        "Paris",
        "Leonardo da Vinci",
        "Mount Everest",
        "1945",
        "William Shakespeare",
        "Au"
    ]
}
dataset = Dataset.from_dict(data)

# Load Llama Model and Tokenizer
tokenizer = AutoTokenizer.from_pretrained("D:\\OLLAMA_MODELS\\meta-llama\\Meta-Llama-3-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained("D:\\OLLAMA_MODELS\\meta-llama\\Meta-Llama-3-8B-Instruct")

# Ensure padding token is set
if tokenizer.pad_token is None:
    tokenizer.add_special_tokens({'pad_token': '[PAD]'})

# Configure LoRA with DoRA enabled (use_dora=True)
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    use_dora=True
)

# Create PEFT Model
model = get_peft_model(model, peft_config)

# Preprocess Data
def generate_prompt(data_point):
    return f"""[INST] {data_point["question"]} [/INST] {data_point["context"]} {data_point["answer"]} [/INST]"""

dataset = dataset.map(lambda data_point: {"text": generate_prompt(data_point)})

# Tokenize Data
def tokenize(prompt):
    result = tokenizer(prompt["text"])
    return {
        "input_ids": result["input_ids"],
        "attention_mask": result["attention_mask"],
    }

tokenized_dataset = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

# Training Arguments (Optimized for CPU)
training_args = TrainingArguments(
    per_device_train_batch_size=1,   # Very small batch size for CPU
    gradient_accumulation_steps=8,   # Accumulate gradients over multiple steps
    num_train_epochs=3,
    learning_rate=1e-4,              # Smaller learning rate for CPU
    logging_steps=10,
    output_dir="./llama-3-finetuned-qa-cpu",
)

# Create Trainer
trainer = Trainer(
    model=model,
    train_dataset=tokenized_dataset,
    args=training_args,
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

# Fine-tune!
model.config.use_cache = False
trainer.train()

# Save the Fine-tuned Model
model.save_pretrained("./llama-3-finetuned-qa-cpu")

Code: Finetune Large Language Models with DoRA (Test).

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

# Load Fine-Tuned Model and Tokenizer
model_path = "E:\\Niraj_Work\\DL_Projects\\llm_projects\\llm_advance_1\\llama-3-finetuned-qa-cpu"  # Path to your saved model
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Ensure Model is on CPU
device = torch.device("cpu")
model.to(device)
if tokenizer.pad_token is None:
    # tokenizer.add_special_tokens({'pad_token': '[PAD]'})
    tokenizer.pad_token = tokenizer.eos_token

# Load Your Question-Answering Dataset (Replace with your dataset)
# Each item is a dictionary with 'question', 'context', and 'answer' keys
eval_data = [
    {"question": "What is the capital of France?", "context": "Paris is the capital and most populous city of France.", "answer": "Paris"},
    {"question": "Who painted the Mona Lisa?", "context": "The Mona Lisa is a half-length portrait painting by Italian Renaissance artist Leonardo da Vinci.", "answer": "Leonardo da Vinci"},
]

# Function to generate the prompt
def generate_prompt(data_point):
    return f"""[INST] {data_point["question"]} [/INST] {data_point["context"]} {data_point["answer"]} [/INST]"""


# Test the Model
for data_point in eval_data:
    input_text = generate_prompt(data_point)
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)  # Move input to CPU

    # Generate Answer
    generation_output = model.generate(
        input_ids=input_ids,
        max_new_tokens=50,   # Adjust as needed
        num_beams=1,         # You can try increasing num_beams if you have enough memory
        early_stopping=True,
    )

    # Extract and Print Answer
    generated_answer = tokenizer.decode(generation_output[0])
    print(f"Question: {data_point['question']}")
    print(f"Generated Answer: {generated_answer.split('[/INST]')[-2].strip()}")
    print(f"Actual Answer: {data_point['answer']}")

Reference.

  1. Liu, Shih-Yang, Chien-Yi Wang, Hongxu Yin, Pavlo Molchanov, Yu-Chiang Frank Wang, Kwang-Ting Cheng, and Min-Hung Chen. "Dora: Weight-decomposed low-rank adaptation." arXiv preprint arXiv:2402.09353 (2024).
  2. https://huggingface.co/papers/2402.09353
  3. https://www.nirajai.com/home/llm