
October 30, 2025

Agentic AI for Personal Productivity: Building a Daily Minutes Assistant with RAG, MCP, and ReAct

Filed under: Agentic AI — admin @ 8:20 pm

Over the last year, I have been applying Agentic AI to various problems at work and to my personal productivity. For example, every morning I faced the same challenge: information overload.

My typical morning looked like this:

  • Check emails and sort out what’s important
  • Check my calendar and figure out which meetings are critical
  • Skim HackerNews, TechCrunch, and newsletters for important insights
  • Check Slack for critical updates
  • Look up the weather: should I bring an umbrella or a jacket?
  • Already lost 45 minutes just gathering information!

I needed an AI assistant that could digest all this information while I shower, then present me with a personalized 3-minute brief highlighting what actually matters. I also had the following key constraints for this assistant:

  • Complete privacy – my emails and calendar shouldn’t leave my laptop, and I didn’t want to run any MCP servers in the cloud that could expose my private credentials
  • Zero ongoing costs – running a complex agentic workflow in a hosted environment could easily cost hundreds of dollars a month
  • Fast iteration – test changes instantly during development
  • Flexible deployment – start local, deploy to the cloud when ready

I will walk through my journey of building Daily Minutes with Claude Code – a fully functional agentic AI system that runs on my laptop using local LLMs and saves me 30 minutes every morning.


Agentic Building Blocks

I applied the following building blocks to create this system:

  1. MCP (Model Context Protocol) – connecting to data sources discoverable by AI
  2. RAG (Retrieval-Augmented Generation) – giving the AI long-term memory
  3. ReAct Pattern – teaching the AI to reason before acting
  4. RLHF (Reinforcement Learning from Human Feedback) – teaching the AI my preferences
  5. LangGraph – orchestrating complex multi-agent workflows
  6. 3-Layer Architecture – building easily extensible systems

Full source code: github.com/bhatti/daily-minutes

Let me walk you through how I built each piece, the problems I encountered, and how I solved them.


High-level Architecture

After several iterations, I landed on a clean 3-layer architecture.

Why this architecture worked for me:

Layer 1 (Data Sources) – I used MCP to make connectors pluggable. When I later wanted to add RSS feeds, I just registered a new tool – no changes to the AI logic.

Layer 2 (Intelligence) – This is where the magic happens. The ReAct agent reasons about what data it needs, LangGraph orchestrates fetching from multiple sources in parallel, RAG provides historical context, and RLHF learns from my feedback.

Layer 3 (UI) – I kept the UI simple and fast. It reads from a database cache, so it loads instantly – no waiting for AI to process.

How the Database Cache Works

This is a key architectural decision that made the UI lightning-fast:

# src/services/startup_service.py
async def preload_daily_data():
    """Background job that generates brief and caches in database."""

    # 1. Fetch all data in parallel (LangGraph orchestration)
    data = await langgraph_orchestrator.fetch_all_sources()

    # 2. Generate AI brief (ReAct agent with RAG)
    brief = await brief_generator.generate(
        emails=data['emails'],
        calendar=data['calendar'],
        news=data['news'],
        weather=data['weather']
    )

    # 3. Cache everything in SQLite
    await db.set_cache('daily_brief_data', brief.to_dict(), ttl=3600)  # 1 hour TTL
    await db.set_cache('news_data', data['news'], ttl=3600)
    await db.set_cache('emails_data', data['emails'], ttl=3600)

    logger.info("? All data preloaded and cached")

# src/ui/components/daily_brief.py
def render_daily_brief_section():
    """UI just reads from cache - no AI processing!"""

    # Fast read from database (milliseconds, not seconds)
    if 'data' in st.session_state and st.session_state.data.get('daily_brief'):
        brief_data = st.session_state.data['daily_brief']
        _display_persisted_brief(brief_data)  # Instant!
    else:
        st.info("Run `make preload` to generate your first brief.")

Why this architecture rocks:

  • UI loads in <500ms (reading from SQLite cache)
  • Background refresh (run make preload or schedule with cron)
  • Persistent (brief survives app restarts)
  • Testable (can test UI without LLM calls)

Part 1: Setting Up My Local AI Stack

First, I needed to get Ollama running locally. This took me about 30 minutes.

Installing Ollama

# On macOS (what I use)
brew install ollama

# Start the service
ollama serve

# Pull the models I chose
ollama pull qwen2.5:7b         # Main LLM - fast on my M3 Mac
ollama pull nomic-embed-text   # For RAG embeddings

Why I chose Qwen 2.5 (7B):

  • Runs fast on my M3 MacBook Pro (no GPU needed)
  • Good reasoning capabilities for summarization
  • Small enough to iterate quickly (responses in 2-3 seconds)
  • Free and private – data never leaves my laptop

Later, I can swap to GPT-4 or Claude with just a config change when I deploy to production.

Testing My Setup

I wanted to make sure Ollama was working before going further:

# Quick test
PYTHONPATH=. python -c "
import asyncio
from src.services.ollama_service import get_ollama_service

async def test():
    ollama = get_ollama_service()
    result = await ollama.generate('Explain RAG in one sentence.')
    print(result)

asyncio.run(test())
"

# Output I got:
# RAG (Retrieval-Augmented Generation) enhances LLM responses by retrieving
# relevant information from a knowledge base before generating answers.

First milestone: Local AI working!


Part 2: Building MCP Connectors

Instead of hard coding data fetching like this:

# My first attempt (brittle)
async def get_daily_data():
    news = await fetch_hackernews()
    weather = await fetch_weather()
    # Later I wanted to add RSS feeds... had to modify this function
    # Then I wanted Slack... modified again
    # This was getting messy fast!

I decided to use MCP (Model Context Protocol) to register data sources as “tools” that the AI can discover and call by name:

Building News Connector

I started with HackerNews since I check it every morning:

# src/connectors/hackernews.py
import asyncio

class HackerNewsConnector:
    """Fetches top stories from HackerNews API."""

    async def execute_async(self, max_stories: int = 10):
        """The main method MCP will call."""
        # 1. Fetch top story IDs
        response = await self.client.get(
            "https://hacker-news.firebaseio.com/v0/topstories.json"
        )
        story_ids = response.json()[:max_stories]

        # 2. Fetch each story in parallel for speed
        stories = await asyncio.gather(
            *(self._fetch_story(story_id) for story_id in story_ids)
        )

        # 3. Convert to the connector's standard article format
        return [self._convert_to_article(story) for story in stories]

Key learning: Keep connectors simple. They should do ONE thing: fetch data and return it in a standard format.
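The connector relies on two small helpers that the post doesn’t show. Here is a minimal sketch of what they could look like, assuming a hypothetical Article model with title, url, and score fields (the field names are my assumption, not necessarily what the repo uses):

    # Hypothetical helpers for the connector above (not copied from the repo)
    async def _fetch_story(self, story_id: int) -> dict:
        """Fetch a single story's JSON from the HackerNews item endpoint."""
        response = await self.client.get(
            f"https://hacker-news.firebaseio.com/v0/item/{story_id}.json"
        )
        return response.json()

    def _convert_to_article(self, story: dict) -> "Article":
        """Map the raw HN payload to the connector's standard article format."""
        return Article(
            title=story.get("title", ""),
            url=story.get("url", f"https://news.ycombinator.com/item?id={story['id']}"),
            score=story.get("score", 0),
        )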

Registering with MCP Server

Then I registered this connector with my MCP server:

# src/services/mcp_server.py
class MCPServer:
    """The tool registry that AI agents query."""

    def _register_tools(self):
        # Register HackerNews
        self.tools["fetch_hackernews"] = MCPTool(
            name="fetch_hackernews",
            description="Fetch top tech stories from HackerNews with scores and comments",
            parameters={
                "max_stories": {
                    "type": "integer",
                    "description": "How many stories to fetch (1-30)",
                    "default": 10
                }
            },
            executor=HackerNewsConnector()
        )

This allows my AI to discover this tool and call it without me writing any special integration code!
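For context, the MCPTool wrapper and the server’s execution path can be quite small. Here is a simplified sketch of how I think of them, consistent with the list_tools and execute_tool calls used elsewhere in this post, though the repo’s actual classes may differ:

# Simplified sketch of the tool registry (my approximation, not the repo's exact code)
from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class MCPTool:
    name: str
    description: str
    parameters: Dict[str, Any]
    executor: Any  # any connector exposing execute_async(**kwargs)

class MCPServer:
    """The tool registry that AI agents query."""

    def __init__(self):
        self.tools: Dict[str, MCPTool] = {}
        self._register_tools()  # the registration method shown above

    def list_tools(self):
        """Expose name/description/parameters so agents can discover tools."""
        return [
            {"name": t.name, "description": t.description, "parameters": t.parameters}
            for t in self.tools.values()
        ]

    async def execute_tool(self, name: str, params: Dict[str, Any]):
        """Look up a registered tool by name and run its connector."""
        return await self.tools[name].executor.execute_async(**params)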

Testing MCP Discovery

# I tested if the AI could discover my tools
PYTHONPATH=. python -c "
from src.services.mcp_server import get_mcp_server

mcp = get_mcp_server()
print('Available tools:')
for tool in mcp.list_tools():
    print(f'  - {tool[\"name\"]}: {tool[\"description\"]}')
"

# Output I got:
# Available tools:
#   - fetch_hackernews: Fetch top tech stories from HackerNews...
#   - get_current_weather: Get current weather conditions...
#   - fetch_rss_feeds: Fetch articles from configured RSS feeds...

Later, when I wanted to add RSS feeds, I just created a new connector and registered it. The AI automatically discovered it – no changes needed to my ReAct agent or LangGraph workflows!


Part 3: Building RAG Pipeline

Because LLMs have a limited context window, RAG (Retrieval-Augmented Generation) can be used to give the AI a semantic memory by:

  1. Converting text to vectors (embeddings)
  2. Storing vectors in a database (ChromaDB)
  3. Searching by meaning, not just keywords

Building RAG Service

I then implemented the RAG service as follows:

# src/services/rag_service.py
import hashlib

import chromadb
from chromadb.config import Settings

from src.services.ollama_service import get_ollama_service

class RAGService:
    """Semantic memory using ChromaDB."""

    def __init__(self):
        # Initialize ChromaDB (stores on disk)
        self.client = chromadb.Client(Settings(
            persist_directory="./data/chroma_data"
        ))

        # Create collection for my articles
        self.collection = self.client.get_or_create_collection(
            name="daily_minutes"
        )

        # Ollama for creating embeddings
        self.ollama = get_ollama_service()

    async def add_document(self, content: str, metadata: dict):
        """Store a document with its vector embedding."""

        # 1. Convert text to vector (this is the magic!)
        embedding = await self.ollama.create_embeddings(content)

        # 2. Store in ChromaDB with metadata
        self.collection.add(
            documents=[content],
            embeddings=[embedding],
            metadatas=[metadata],
            ids=[hashlib.md5(content.encode()).hexdigest()]
        )

    async def search(self, query: str, max_results: int = 5):
        """Semantic search - find by meaning!"""

        # 1. Convert query to vector
        query_embedding = await self.ollama.create_embeddings(query)

        # 2. Find similar documents (cosine similarity)
        results = self.collection.query(
            query_embeddings=[query_embedding],
            n_results=max_results
        )

        return results

I then tested it:

# I stored an article about EU AI regulations
await rag.add_document(
    content="European Union announces comprehensive AI safety regulations "
            "focusing on transparency, accountability, and privacy protection.",
    metadata={"type": "article", "topic": "ai_safety"}
)

# Later, I searched using different words
results = await rag.search("privacy rules for artificial intelligence")

This shows that RAG isn’t just storing text – it understands meaning through vector mathematics.
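To make “vector mathematics” concrete: similarity between embeddings is typically measured with cosine similarity, which compares the angle between two vectors. A toy illustration, not part of the project code:

# Toy illustration of cosine similarity (not from the repo)
import math
from typing import List

def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Score two embedding vectors; values near 1.0 mean very similar meaning."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Embeddings of "EU AI safety regulations" and "privacy rules for artificial
# intelligence" land close together in vector space, which is why the search
# above matches even though the two texts share almost no keywords.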

What I Store in RAG

Over time, I started storing other data such as emails, todos, and events:

# 1. News articles (for historical context)
await rag.add_article(article)

# 2. Action items from emails
await rag.add_todo(
    "Complete security training by Nov 15",
    source="email",
    priority="high"
)

# 3. Meeting context
await rag.add_document(
    "Q4 Planning Meeting - need to prepare budget estimates",
    metadata={"type": "meeting", "date": "2025-02-01"}
)

# 4. User preferences (this feeds into RLHF later!)
await rag.add_document(
    "User marked 'AI safety' topics as important",
    metadata={"type": "preference", "category": "ai_safety"}
)

With this AI memory, the assistant can answer questions like these (a sketch of the query flow follows the list):

  • “What do I need to prepare for tomorrow’s meeting?”
  • “What AI safety articles did I read this week?”
  • “What are my pending action items?”
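A minimal sketch of that query flow, reusing the rag and ollama service objects from earlier; the answer_from_memory helper is illustrative, not a function from the repo:

# Illustrative helper (my sketch, not the repo's API)
async def answer_from_memory(question: str) -> str:
    """Retrieve relevant memories, then let the LLM answer grounded in them."""
    results = await rag.search(question, max_results=5)
    context = "\n".join(r['content'] for r in results)

    prompt = (
        "Using only the context below, answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    return await ollama.generate(prompt)

# Example: await answer_from_memory("What are my pending action items?")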

Part 4: Building the ReAct Agent

In my early prototyping, the implementation just executed blindly:

# First attempt - no thinking!
async def generate_brief():
    news = await fetch_all_news()  # Fetches everything
    summary = await llm.generate(f"Summarize: {news}")
    return summary

This wasted time fetching data I didn’t need. I wanted my AI to reason first, then act, so I applied ReAct (Reasoning + Acting), which works in a loop:

  1. THOUGHT: AI reasons about what to do next
  2. ACTION: AI executes a tool/function
  3. OBSERVATION: AI observes the result
  4. Repeat until goal achieved

Implementing My ReAct Agent

Here is how the ReAct agent was built:

# src/agents/react_agent.py
class ReActAgent:
    """Agent that thinks before acting."""

    async def run(self, goal: str):
        """Execute goal using ReAct loop."""
        steps = []
        observations = []

        for step_num in range(1, self.max_steps + 1):
            # 1. THOUGHT: Ask AI what to do next
            thought = await self._generate_thought(goal, steps, observations)

            # Check if we're done
            if "FINAL ANSWER" in thought:
                return self._extract_answer(thought)

            # 2. ACTION: Parse what action to take
            action = self._parse_action(thought)
            # Example: {"action": "call_tool", "tool": "fetch_hackernews"}

            # 3. EXECUTE: Run the action via MCP
            observation = await self._execute_action(action)
            observations.append(observation)

            # Record this step for debugging
            steps.append({
                "thought": thought,
                "action": action,
                "observation": observation
            })

        return {"steps": steps, "answer": "Max steps reached"}

The hardest part was writing the prompts that made the AI reason properly:

async def _generate_thought(self, goal, steps, observations):
    """Generate next reasoning step."""

    prompt = f"""Goal: {goal}

Previous steps:
{self._format_steps(steps)}

Available actions:
- query_rag(query): Search my semantic memory
- call_tool(name, params): Execute an MCP tool
- FINAL ANSWER: When you have everything needed

Think step-by-step. What should I do next?

Format your response as:
THOUGHT: <your reasoning>
ACTION: <action to take>
"""

    return await self.ollama.generate(prompt, temperature=0.7)

I added debug logging to see the AI’s reasoning:

Goal: Generate my daily brief

Step 1:
  THOUGHT: I need to gather news, check weather, and see user preferences
  ACTION: call_tool("fetch_hackernews", max_stories=10)
  OBSERVATION: Fetched 10 articles about AI, privacy, and tech

Step 2:
  THOUGHT: Got news. User preferences would help prioritize.
  ACTION: query_rag("user interests and preferences")
  OBSERVATION: User cares about AI safety, security, privacy

Step 3:
  THOUGHT: Should filter articles to user's interests
  ACTION: call_tool("get_current_weather", location="Seattle")
  OBSERVATION: 70°F, Partly cloudy

Step 4:
  THOUGHT: I have news (filtered by user interests), weather. Ready to generate.
  ACTION: FINAL ANSWER
  Generated personalized brief highlighting AI safety articles
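The run loop above also depends on a _parse_action helper that turns the model’s ACTION line into something executable. The repo has its own parser; this is a minimal sketch of the idea (parameter values are kept as strings for simplicity):

# Minimal sketch of action parsing (my approximation, not the repo's parser)
import re

def _parse_action(self, thought: str) -> dict:
    """Extract the ACTION line, e.g. ACTION: call_tool("fetch_hackernews", max_stories=10)."""
    match = re.search(r"ACTION:\s*(\w+)\((.*)\)", thought)
    if not match:
        return {"action": "none"}

    name, raw_args = match.group(1), match.group(2)
    if name == "query_rag":
        return {"action": "query_rag", "query": raw_args.strip(' "\'')}

    # call_tool("tool_name", key=value, ...) - deliberately simple parsing
    parts = [p.strip() for p in raw_args.split(",")]
    tool = parts[0].strip(' "\'')
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip()] = value.strip(' "\'')
    return {"action": "call_tool", "tool": tool, "params": params}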

Part 5: Adding RLHF

Initially, my AI scored all emails the same way:

“Newsletter: 10 CSS Tips” → Importance: 0.5
“URGENT: Production outage!” → Importance: 0.5

So I used RLHF to teach my AI what I care about.

Implementing RLHF Scoring

I added a mixin to my email model:

# src/models/email.py
from typing import Set

class ImportanceScoringMixin:
    """Learn from user feedback."""

    importance_score: float = 0.5  # AI's base score
    boost_labels: Set[str] = set()  # Words user marked important
    filter_labels: Set[str] = set()  # Words user wants to skip

    def apply_rlhf_boost(self, content_text: str) -> float:
        """Adjust score based on learned preferences."""
        adjusted = self.importance_score
        content_lower = content_text.lower()

        # Boost if content matches important keywords
        for label in self.boost_labels:
            if label.lower() in content_lower:
                adjusted += 0.1  # Bump up priority!

        # Penalize if content matches skip keywords
        for label in self.filter_labels:
            if label.lower() in content_lower:
                adjusted -= 0.2  # Push down priority!

        # Keep in valid range [0, 1]
        return max(0.0, min(1.0, adjusted))

Note: Code examples are simplified for clarity.
See GitHub for the full production implementation.
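To see the effect, here is a small usage example of the mixin above. The labels are illustrative; the starting point is the AI’s base score of 0.5:

# Illustrative usage of apply_rlhf_boost (labels are examples, not from the repo)
profile = ImportanceScoringMixin()
profile.boost_labels = {"outage", "production"}   # learned from 👍 feedback
profile.filter_labels = {"newsletter", "css"}     # learned from 👎 feedback

profile.apply_rlhf_boost("URGENT: Production outage!")  # 0.5 + 0.1 + 0.1 = 0.7
profile.apply_rlhf_boost("Newsletter: 10 CSS Tips")     # 0.5 - 0.2 - 0.2 = 0.1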

Adding Feedback UI

In my Streamlit dashboard, I added thumbs-up/thumbs-down buttons:

# User sees an email
for email in emails:
    col1, col2, col3 = st.columns([8, 1, 1])

    with col1:
        st.write(f"**{email.subject}**")
        st.info(email.snippet)

    with col2:
        if st.button("?", key=f"important_{email.id}"):
            # Extract what made this important
            keywords = await extract_keywords(email.subject + email.body)
            # Add to boost labels
            user_profile.boost_labels.update(keywords)
            st.success(f"? Learned: You care about {', '.join(keywords)}")

    with col3:
        if st.button("?", key=f"skip_{email.id}"):
            # Learn to deprioritize these
            keywords = await extract_keywords(email.subject)
            user_profile.filter_labels.update(keywords)
            st.success(f"? Will deprioritize: {', '.join(keywords)}")

Part 6: Orchestrating with LangGraph

Instead of fetching content from all data sources sequentially for the daily brief:

# Sequential execution - SLOW!
news = await fetch_news()      # 5 seconds
emails = await fetch_emails()  # 3 seconds
calendar = await fetch_calendar()  # 2 seconds
weather = await fetch_weather()  # 1 second
# Total: 11 seconds just waiting!

I used LangGraph to define workflows as graphs with parallel execution.

Key insight: fetching the sources in parallel cuts the total wait to roughly the slowest single source – about 5 seconds instead of 11 in my runs.

Building My Workflow

# src/services/langgraph_orchestrator.py
from langgraph.graph import StateGraph, END

class LangGraphOrchestrator:
    def _create_workflow(self):
        """Define my workflow graph."""
        workflow = StateGraph(WorkflowState)

        # Add nodes (processing steps)
        workflow.add_node("analyze", self._analyze_request)
        workflow.add_node("fetch_news", self._fetch_news)
        workflow.add_node("fetch_emails", self._fetch_emails)
        workflow.add_node("fetch_calendar", self._fetch_calendar)
        workflow.add_node("search_rag", self._search_context)
        workflow.add_node("generate_summary", self._generate_summary)

        # Define edges (execution flow)
        workflow.set_entry_point("analyze")

        # Parallel fetch (all happen at once!)
        workflow.add_edge("analyze", "fetch_news")
        workflow.add_edge("analyze", "fetch_emails")
        workflow.add_edge("analyze", "fetch_calendar")

        # All converge to RAG search
        workflow.add_edge("fetch_news", "search_rag")
        workflow.add_edge("fetch_emails", "search_rag")
        workflow.add_edge("fetch_calendar", "search_rag")

        # Sequential processing
        workflow.add_edge("search_rag", "generate_summary")
        workflow.add_edge("generate_summary", END)

        return workflow.compile()

Note: WorkflowState is a shared dictionary that nodes pass data through – like a clipboard for the workflow. The analyze node parses the user’s request and decides which data sources are needed.
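For reference, that shared state can be expressed as a TypedDict built from the fields the nodes below read and write. This is a sketch based on this post, not necessarily the repo’s exact definition:

# Sketch of the shared workflow state (fields inferred from the nodes in this post)
from typing import List, TypedDict

class WorkflowState(TypedDict, total=False):
    user_request: str             # what the user asked for
    news_articles: List[dict]     # written by fetch_news
    emails: List[dict]            # written by fetch_emails
    calendar_events: List[dict]   # written by fetch_calendar
    context: str                  # RAG context assembled by search_rag
    summary: str                  # final brief from generate_summary
    errors: List[str]             # failures collected along the way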

Implementing Node Functions

Each node is just an async function:

async def _fetch_news(self, state: WorkflowState):
    """Fetch news in parallel."""
    try:
        articles = await self.mcp.execute_tool(
            "fetch_hackernews",
            {"max_stories": 10}
        )
        state["news_articles"] = articles
    except Exception as e:
        state["errors"].append(f"News fetch failed: {e}")
        state["news_articles"] = []

    return state

async def _search_context(self, state: WorkflowState):
    """Search RAG for relevant context."""
    query = state["user_request"]
    results = await self.rag.search(query, max_results=5)

    # Build context string
    context = "\n".join([r['content'] for r in results])
    state["context"] = context

    return state

Running the Workflow

# Execute the complete workflow
result = await orchestrator.run("Generate my daily brief")

# I get back:
{
    "news_articles": [...],      # 10 articles
    "emails": [...],              # 5 unread
    "calendar_events": [...],     # 3 events today
    "context": "...",             # RAG context
    "summary": "...",             # Generated brief
    "processing_time": 5.2        # Seconds (not 11!)
}

The LLM Factory Pattern – How I Made It Cloud-Ready

The following code snippet shows how the system seamlessly switches between local Ollama and cloud providers:

# src/services/llm_factory.py
import os
from typing import List

def get_llm_service():
    """Factory pattern - works with any LLM provider."""
    provider = os.getenv("LLM_PROVIDER", "ollama")

    if provider == "ollama":
        return OllamaService(
            base_url=os.getenv("OLLAMA_BASE_URL", "http://localhost:11434"),
            model=os.getenv("OLLAMA_MODEL", "qwen2.5:7b")
        )
    elif provider == "openai":
        return OpenAIService(
            api_key=os.getenv("OPENAI_API_KEY"),
            model=os.getenv("OPENAI_MODEL", "gpt-4-turbo")
        )
    elif provider == "google":
        # Like in my previous Vertex AI article!
        return VertexAIService(
            project_id=os.getenv("GCP_PROJECT_ID"),
            model="gemini-1.5-flash"
        )

    raise ValueError(f"Unknown provider: {provider}")

# All services implement the same interface:
class BaseLLMService:
    async def generate(self, prompt: str, **kwargs) -> str:
        """Generate text from prompt."""
        raise NotImplementedError

    async def create_embeddings(self, text: str) -> List[float]:
        """Create vector embeddings."""
        raise NotImplementedError

The ReAct agent, RAG service, and Brief Generator all use get_llm_service() – they don’t care which provider is running!


Part 7: The Challenges I Faced

Building this system wasn’t smooth. Here are the biggest challenges:

Challenge 1: LLM Generating Vague Summaries

Problem: My early briefs were terrible:

“Today's news features a mix of technology updates and various topics.”

This was useless! I needed specifics.

Solution: I rewrote my prompts with explicit rules:

# Better prompt with strict rules
prompt = f"""Generate a daily brief following these STRICT rules:

PRIORITY ORDER (most important first):
1. Urgent emails or action items
2. Today's calendar events
3. Market/business news
4. Tech news

TLDR FORMAT (exactly 3 bullets, be SPECIFIC):
* Bullet 1: Most urgent email/action (include WHO, WHAT, WHEN)
   Example: "Client escalation from Acme Corp affecting 50K users - response needed by 2pm"

* Bullet 2: Most important calendar event today (include TIME and WHAT TO PREPARE)
   Example: "2pm: Board meeting - prepare Q4 revenue slides"

* Bullet 3: Top market/business news (include NUMBERS/SPECIFICS)
   Example: "Federal Reserve raises rates 0.5% to 5.25% - affects tech hiring"

AVOID THESE PHRASES (they're too vague):
- "mix of updates"
- "various topics"
- "continues to make progress"
- "interesting developments"

USE SPECIFIC DETAILS:
- Names (people, companies)
- Numbers (percentages, dollar amounts, deadlines)
- Times (when something happened or needs to happen)

Content to summarize:
{content}

Generate: TLDR (3 bullets), Summary (5-6 detailed sentences), Key Insights (5 bullets)
"""

Result: Went from vague to specific, actionable briefs!

Challenge 2: TLDR Bullets Rendering on Same Line

Problem: My UI showed bullets in one paragraph:

• Critical email... • Meeting at 2pm... • Market news...

Root cause: Streamlit’s st.info() doesn’t preserve newlines.

Solution: Split and render each bullet separately:

# Doesn't work
st.info(tldr)

# Works!
tldr_lines = [line.strip() for line in tldr.split('\n') if line.strip()]
for bullet in tldr_lines:
    st.markdown(bullet)

Challenge 3: AI Prioritizing News Over Personal Tasks

Problem: My brief focused on tech news, ignored my urgent emails:

TLDR bullet 1: "OpenAI releases GPT-5" (who cares?)
   TLDR bullet 2: "Crypto market surges" (not relevant to me)
   TLDR bullet 3: "Client escalation requires response" (BURIED!)

Solution: I restructured my prompt to explicitly label priority:

# src/services/brief_scheduler.py
async def _generate_daily_brief(emails, calendar, news, weather):
    """Generate prioritized daily brief with structured prompt."""

    # Separate market vs tech news (market is higher priority)
    market_news = [n for n in news if 'market' in n.tags]
    tech_news = [n for n in news if 'market' not in n.tags]

    # Sort emails by RLHF-boosted importance score
    important_emails = sorted(
        emails,
        key=lambda e: e.apply_rlhf_boost(e.subject + e.snippet),
        reverse=True
    )[:5]  # Top 5 only

    # Build structured prompt with clear priority
    prompt = f"""
**SECTION 1: IMPORTANT EMAILS (HIGHEST PRIORITY - use for TLDR bullet #1)**
{format_emails(important_emails)}

**SECTION 2: TODAY'S CALENDAR (SECOND PRIORITY - use for TLDR bullet #2)**
{format_calendar(calendar)}

**SECTION 3: MARKET NEWS (THIRD PRIORITY - use for TLDR bullet #3)**
{format_market_news(market_news)}

**SECTION 4: TECH NEWS (LOWEST PRIORITY - summarize briefly)**
{format_tech_news(tech_news)}

**SECTION 5: WEATHER**
{format_weather(weather)}

Generate a daily brief following this EXACT priority order:
1. Email action items FIRST
2. Calendar events SECOND
3. Market/business news THIRD
4. Tech news LAST (brief mention only)

TLDR must have EXACTLY 3 bullets using content from sections 1, 2, 3 (not section 4).
"""

    return await llm.generate(prompt)

Result: My urgent email moved to bullet #1 where it belongs! The AI now respects the priority structure.

Challenge 4: RAG Returning Irrelevant Results

Problem: Semantic search sometimes returned weird matches:

Query: "AI safety regulations"
Result: Article about "safe AI models for healthcare" (wrong context!)

Solution: I added metadata filtering and better embeddings:

# Store with rich metadata
await rag.add_document(
    content=article.title,
    metadata={
        "type": "article",
        "category": "ai_safety",  # For filtering!
        "tags": ["regulation", "eu", "policy"],
        "date": "2025-01-28",
        "importance": "high"
    }
)

# Search with filters
results = await rag.search(
    "AI regulations",
    filter_metadata={
        "category": "ai_safety",
        "importance": "high"
    }
)

Result: Much more relevant results!

Challenge 5: Handling API Failures

Problem: MCP connectors may fail to fetch data from the underlying data source.

Solution: I used graceful degradation: the brief is generated from whatever data is available, and failed sources are marked with an error message and their last-updated time.
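Here is a simplified sketch of that pattern, reusing the MCP server’s execute_tool from earlier (an illustration, not the repo’s exact code): fetch every source concurrently, treat exceptions as data, and record which sources failed and when, so the brief can still be generated and the UI can flag stale sections.

# Sketch of graceful degradation across MCP sources (my approximation)
import asyncio
from datetime import datetime, timezone

async def fetch_all_sources_gracefully(mcp, sources: dict):
    """sources maps a label to (tool_name, params); a failed source never aborts the brief."""
    results, errors = {}, {}

    tasks = {
        label: mcp.execute_tool(tool_name, params)
        for label, (tool_name, params) in sources.items()
    }
    outcomes = await asyncio.gather(*tasks.values(), return_exceptions=True)

    for label, outcome in zip(tasks, outcomes):
        if isinstance(outcome, Exception):
            errors[label] = {
                "error": str(outcome),
                "failed_at": datetime.now(timezone.utc).isoformat(),
            }
            results[label] = []  # the brief is generated without this source
        else:
            results[label] = outcome

    return results, errors

Calling it with something like {"news": ("fetch_hackernews", {"max_stories": 10}), "weather": ("get_current_weather", {"location": "Seattle"})} yields partial results plus an errors map the UI can show next to each section.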


Part 8: Future Improvements

This work is by no means done; I am sharing the proof of concept I have built so far. Here is what still needs work:

Current Limitations

1. Email/Calendar Improvements

  • Have: Basic OAuth support for emails and calendar events, plus mock testing
  • Missing: Solid OAuth support for Gmail and Google Calendar integration

2. RLHF Needs More Sophisticated Learning

  • Have: The current system does simple keyword matching (if an email contains “security” → boost)
  • Missing: Context-aware learning (distinguish “security update” vs “security breach”)
  • Improvement Needed:
  # Current: Simple keyword match
  if "security" in email:
      score += 0.1

  # Better: Contextual understanding
  if embeddings_similar(email, user.important_emails):
      score += contextual_boost  # Uses semantic similarity!

3. ReAct Agent Sometimes Over-Thinks

  • Have: AI reasons before acting
  • Problem: Sometimes takes 4-5 steps when 2 would suffice
  • Fix Needed: Better stopping criteria in prompts

4. No Multi-User Support (Yet)

  • Have: Works great for me
  • Missing: Can’t handle multiple users with different preferences
  • Future: Add user profiles, tenant isolation

5. Brief Generation Can Be Slow (30-60 seconds)

  • Have: Parallel data fetching (fast)
  • Bottleneck: LLM generation with Qwen 2.5 on CPU
  • Options:
  • Use smaller model (faster but less capable)
  • Deploy to cloud with faster GPT-4

6. Missing Data Sources

  • Have: Basic data for news, email, and calendar
  • Slack Integration: Add Slack to monitor important channels and surface urgent threads in the daily brief
  • Social Media Integration: Add social media feeds to monitor trending topics and news

Part 9: How to Extend This System

I designed this to be easily extensible. Here’s how you can add new features:

Adding a New Data Source (Example: Slack)

Step 1: Create the connector

# src/connectors/slack.py
import os

from slack_sdk.web.async_client import AsyncWebClient

class SlackConnector:
    """Fetch recent messages from Slack channels."""

    async def execute_async(self, channel: str, max_messages: int = 10):
        # 1. Connect to Slack API (async client, so the call below can be awaited)
        client = AsyncWebClient(token=os.getenv("SLACK_BOT_TOKEN"))

        # 2. Fetch recent messages
        response = await client.conversations_history(
            channel=channel,
            limit=max_messages
        )

        # 3. Convert to standard format
        messages = []
        for msg in response['messages']:
            messages.append(SlackMessage(
                text=msg['text'],
                user=msg['user'],
                channel=channel,
                timestamp=msg['ts']
            ))

        return messages

Step 2: Register with MCP (automatic discovery!)

# src/services/mcp_server.py
def _register_tools(self):
    # ... existing tools ...

    # Add Slack
    self.tools["get_slack_messages"] = MCPTool(
        name="get_slack_messages",
        description="Fetch recent Slack messages from a channel",
        parameters={
            "channel": {"type": "string", "description": "Channel name"},
            "max_messages": {"type": "integer", "default": 10}
        },
        executor=SlackConnector()
    )

Step 3: AI automatically discovers it!

# Your ReAct agent will now see:
# "Available tools: fetch_hackernews, get_slack_messages, ..."
# No changes needed to ReAct logic!

Step 4: Update brief prompt to include Slack

# src/services/brief_scheduler.py
prompt = f"""
**IMPORTANT EMAILS**: {emails}
**CALENDAR**: {calendar}
**SLACK HIGHLIGHTS**: {slack_messages}  # New!
**NEWS**: {news}

Generate brief prioritizing: Email > Calendar > Slack > News
"""

Part 10: Local Development vs Cloud

One of my favorite aspects of this architecture: develop locally, deploy to the cloud with one config change.

Development (What I Use Daily)

# .env.development
LLM_PROVIDER=ollama
LLM_OLLAMA_BASE_URL=http://localhost:11434
LLM_OLLAMA_MODEL=qwen2.5:7b
DATABASE_URL=sqlite:///./data/daily_minutes.db
REDIS_URL=redis://localhost:6379

Benefits I experience daily:

  • Free: Zero API costs (I iterate 50+ times/day)
  • Fast: No network latency, responses in 2-3 seconds
  • Private: My emails never touch the internet
  • Offline: Works on planes and in cafes without WiFi

Trade-offs I accept:

  • Slower than GPT-4
  • Less capable reasoning (7B vs 175B+ parameters)
  • Manual updates (pull new Ollama models myself)

Production

# .env.production
LLM_PROVIDER=openai  # Just change this line!
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4-turbo
DATABASE_URL=postgresql://...  # Scalable DB
REDIS_URL=redis://prod-cluster:6379  # Distributed cache

The magic: Same code, different LLM!

# src/services/llm_factory.py
def get_llm_service():
    """Factory pattern - works with any LLM."""
    provider = os.getenv("LLM_PROVIDER", "ollama")

    if provider == "ollama":
        return OllamaService()
    elif provider == "openai":
        return OpenAIService()
    elif provider == "anthropic":
        return ClaudeService()
    elif provider == "google":
        return VertexAIService()  # Like in my previous article!

    raise ValueError(f"Unknown provider: {provider}")

Part 11: Testing Everything

I used TDD extensively to build each feature so that it’s easy to debug if something is not working:

Unit Tests

# Test MCP tool registration
pytest tests/unit/test_mcp_server.py -v

# Test RAG semantic search
pytest tests/unit/test_rag_service.py -v

# Test ReAct reasoning
pytest tests/unit/test_react_agent.py -v

# Test RLHF scoring
pytest tests/unit/test_rlhf_scoring.py -v

# Run all unit tests
pytest tests/unit/ -v
# 516 passed in 45.23s

Integration Tests

Also, in some cases unit tests couldn’t fully validate the behavior, so I wrote integration tests, for example to test persistence logic against the SQLite database or to generate real analysis from news:

# tests/integration/test_brief_quality.py
async def test_tldr_has_three_bullets():
    """TLDR must have exactly 3 bullets."""
    brief = await db.get_cache('daily_brief_data')
    tldr = brief.get('tldr', '')

    bullets = [line for line in tldr.split('\n') if line.strip().startswith('•')]

    assert len(bullets) == 3, f"Expected 3 bullets, got {len(bullets)}"
    assert "email" in bullets[0].lower() or "urgent" in bullets[0].lower()
    assert "calendar" in bullets[1].lower() or "meeting" in bullets[1].lower()

async def test_no_generic_phrases():
    """Brief should not contain vague phrases."""
    brief = await db.get_cache('daily_brief_data')
    summary = brief.get('summary', '')

    bad_phrases = ["mix of updates", "various topics", "continues to"]
    for phrase in bad_phrases:
        assert phrase not in summary.lower(), f"Found generic phrase: {phrase}"

Manual Testing (My Daily Workflow)

# 1. Fetch data and generate brief
make preload

# Output I see:
# Fetching news from HackerNews... (10 articles)
# Fetching weather... (70°F, Sunny)
# Analyzing articles with AI... (15 articles)
# Generating daily brief... (Done in 18.3s)
# Brief saved to database

# 2. Launch UI
streamlit run src/ui/streamlit_app.py

# 3. Check brief quality
# - Is TLDR specific? (not vague)
# - Are priorities correct? (email > calendar > news)
# - Are action items extracted? (from emails)
# - Did RLHF work? (boosted my preferences)

Note: You can schedule preload via cron; for example, I run it at 6am daily so that the brief is ready when I wake up.
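A crontab entry along these lines regenerates the brief at 6am (the path is a placeholder for wherever you cloned the repo):

# crontab -e
0 6 * * * cd /path/to/daily-minutes && make preload >> preload.log 2>&1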


Conclusion

Building this Daily Minutes assistant changed how I start my day by giving me a personalized 3-minute brief highlighting what truly matters. Agentic AI excels at automating complex workflows that require judgment, not just execution. The ReAct agent reasons through prioritization. RAG provides contextual memory across weeks of interactions. RLHF learns from my feedback, getting smarter about what I care about. LangGraph orchestrates parallel execution across multiple data sources. These building blocks work together to handle decisions that traditionally needed human attention.

I’m sharing this as a proof of concept, not a finished product. The code works, saves me real time, and demonstrates these techniques effectively. But I’m still iterating. The OAuth integration and error handling need improvement. The RLHF scoring could be more sophisticated. The ReAct agent sometimes overthinks simple tasks. I’m adding these improvements gradually, testing each change against my daily routine.

The real lesson? Start small, validate with real use, then scale with confidence. I used Claude Code to build this in spare time over a couple of weeks. You can do the same: clone the repo, adapt it to your workflow, and see where agentic AI saves you time.

Try It Yourself

# Clone my repo
git clone https://github.com/bhatti/daily-minutes
cd daily-minutes

# Install dependencies
pip install -r requirements.txt

# Setup Ollama
ollama pull qwen2.5:7b
ollama pull nomic-embed-text

# Generate your first brief
make preload

# Launch dashboard
streamlit run src/ui/streamlit_app.py

Resources
