Decoding Fedspeak With Machine Learning
A Machine Learning Approach to Analyzing FOMC Statements and Predicting Federal Funds Rate Changes
Context for this Article
The contents of this article are drawn from a report I recently completed for my Machine Learning class (COM S 474) during my last semester at Iowa State University. It ended up getting the highest report grade in the entire class, which was pretty cool and showed me I might be onto something interesting.
So, in layman’s terms, what’s it actually about? Well, I tried to use machine learning to analyze and “predict” what the Federal Reserve might do with interest rates by analyzing their official statements and meeting minutes. Many financial firms and institutions are heavily impacted by the decisions announced at these meetings and in the accompanying press releases, so any data connected to them has the potential to be useful.
Abstract
Across this project, I addressed the challenge of predicting Federal Funds Rate changes based on textual analysis of two combined datasets: FOMC meeting statements and minutes. The statements are publicly released documents, and the minutes are detailed records made available three weeks after each meeting. I applied natural language processing and machine learning techniques to analyze sentiment and linguistic features within these texts, which were then used to train predictive models.
The approach involved preprocessing the data through tokenization, stop-word removal, and vectorization to transform the text into a format suitable for model training. Feature engineering efforts included extracting TF-IDF scores and sentiment analysis metrics. Both linear regression and neural network models were developed and evaluated, with the neural network model undergoing multiple pseudo-randomly configured iterations for optimization.
The best-performing neural network model achieved a final validation accuracy of 61.90%, significantly above the baseline accuracy of random guessing, which would be around 33% given the three-class problem of rate changes (up, down, unchanged). These results demonstrate the potential of using machine learning to interpret and predict based on financial texts, though they also underscore the need for further enhancements. Future efforts might explore more complex linguistic features and additional economic indicators to improve predictive accuracy.
Introduction
The Federal Reserve (Fed) plays a crucial role in shaping the economic landscape of the United States through its monetary policies. These policies include critical decisions on interest rates and other measures such as quantitative easing or tightening, which significantly influence inflation, employment rates, and overall economic stability. As these topics become increasingly relevant in public discourse, particularly during periods of economic uncertainty, the ability to anticipate changes in Fed policies can offer substantial advantages to financial institutions, investors, and policymakers.
The Federal Open Market Committee (FOMC), a key component of the Fed, regularly convenes to discuss and set these policies. The outcomes of these meetings, along with the subsequent public statements and detailed minutes, are highly anticipated events that can sway market sentiments and economic forecasts overnight. These documents are not only a direct communication from the Fed regarding its current policy stance but also a reflection of the underlying sentiment — whether ‘hawkish’ (favoring higher interest rates to curb inflation) or ‘dovish’ (favoring lower interest rates to encourage borrowing and investment).
A notable feature of these communications is their use of “Fedspeak”, a unique jargon characterized by vague and technical language, which can obscure the Fed’s intentions to the public. Understanding Fedspeak is crucial for interpreting the subtle cues that may indicate future monetary policy actions. This term was popularized to describe the cryptic and cautious language style used by Federal Reserve officials when discussing economic outlooks, further complicating the analysis of these important documents (Wikipedia contributors, 2024).
Given the human judgment behind these policy decisions and the nuanced language used in FOMC communications, there is potential to extract predictive insights using advanced analytical techniques. Machine learning offers tools to decipher complex patterns in text that may elude conventional analysis. This project draws inspiration from exploratory work such as the article “Sensing the Fed’s Direction with the Help of AI” by Morgan Stanley (2023), which discussed the potential of AI in interpreting Fed communications to predict policy shifts. Despite its high-level approach and lack of shared methodologies, it highlighted the burgeoning interest in leveraging technology to forecast economic indicators.
This problem is inherently suitable for machine learning. Given the dynamic and impactful nature of Fed decisions, predicting the direction of interest rate changes based on textual analysis of FOMC statements and minutes can be formulated as a classification problem. This approach focuses on determining the direction of the change (increase, decrease, or unchanged) rather than quantifying the change itself. The choice of classification over regression was made under the premise that the impact of directional changes often holds more significance for economic strategies and decisions than the exact magnitude of change as a numerical value.
In this report, I aim to delve into the potential application of machine learning, specifically natural language processing combined with predictive modeling, to explore whether it can predict shifts in the Federal Funds Rate from the textual content of FOMC statements and minutes. My aim is to bridge economic theory with practical insight. I detail the datasets used, the exploratory methodologies I applied, the experimental setups I tested, and the preliminary results I obtained, all within the context of an exploratory analysis on the use of machine learning in economic forecasting.
Related Work
Recent research has explored the potential of using natural language processing (NLP) techniques to analyze central bank communications and predict monetary policy decisions. Doh, Kim, and Yang (2021) apply NLP to analyze the tone and content of Federal Open Market Committee (FOMC) statements, highlighting the significant impact of qualitative descriptions of economic conditions on financial markets. They utilize the Universal Sentence Encoder (USE) to quantify the semantic similarity between official FOMC statements and alternative draft statements, demonstrating the importance of language in framing policy actions.
Meade and Acosta (2015) also analyze the semantic content of FOMC postmeeting statements using cosine similarity. They apply text preprocessing techniques and term frequency-inverse document frequency (TF-IDF) weighting to extract meaningful semantic information, revealing that the semantic content of the statements is less persistent than the surface-level language might indicate.
Kim, Spörer, and Handschuh (2023) employ advanced language modeling techniques such as VADER, FinBERT, and GPT-4 to analyze the linguistic aspects of FOMC communication. They find that the FOMC follows templates to cover economic situations and avoids expressing emotion in their sentences. The authors highlight the challenges of applying basic language models to the highly condensed text of FOMC minutes and propose “contextual matching” to better capture language nuances.
These studies collectively underscore the growing importance of NLP and machine learning techniques in analyzing central bank communications. They provide valuable insights into the potential of these methods in interpreting and predicting based on financial texts, as well as the need for continued research and development to enhance their accuracy and effectiveness. The findings and methodologies presented in these articles serve as an inspiration and foundation for this project, which aims to connect these statements with more practical insight by applying machine learning to predict potential shifts in the Federal Funds Rate based on the textual content of FOMC statements and minutes.
Data
The datasets used in this project were obtained from Kaggle, a popular platform for sharing and discovering datasets. Two specific datasets were utilized:
- “FOMC Meeting Statements & Minutes” dataset by user vladtasca
- “Effective Federal Funds Rate” dataset by user natashk
While web scraping the data directly from the Federal Reserve’s online resources would have been ideal, these pre-made datasets were chosen to streamline the data acquisition process and focus on the core objectives of the project.
The “FOMC Meeting Statements & Minutes” dataset contains the full text of the Federal Open Market Committee’s post-meeting statements and detailed minutes. These documents provide valuable insights into the FOMC’s economic assessments, policy decisions, and forward guidance. The dataset spans from 2000 to 2021, covering a significant period of modern U.S. monetary policy history.
The “Effective Federal Funds Rate” dataset provides daily data on the effective federal funds rate, which is the interest rate at which depository institutions lend reserve balances to other depository institutions overnight. This rate is a key benchmark for short-term interest rates and is closely watched by financial markets as an indicator of monetary policy stance. The dataset covers 1954 to 2021, fully overlapping the period covered by the statements dataset.
To prepare the data for analysis, several preprocessing steps were performed. First, the two datasets were merged based on the meeting dates to align the textual data with the corresponding effective federal funds rate. The resulting merged dataset contained the following key features:
- Date: The date of the FOMC meeting
- Text: The full text of the FOMC statement or minutes
- EFFR: The effective federal funds rate on the meeting date
- Future_EFFR: The effective federal funds rate 90 days after the meeting date
- Rate_Change: A categorical variable indicating whether the rate increased, decreased, or remained unchanged over the following 90-day period (the label construction is sketched below)
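For concreteness, the label boils down to a three-way comparison between the current and future effective rates. Below is a minimal sketch mirroring the merge_datasets logic in the Code section, assuming a merged DataFrame with the EFFR and Future_EFFR columns described above:

import numpy as np
import pandas as pd

def label_rate_change(merged: pd.DataFrame) -> pd.Series:
    # 'Up' if the rate 90 days out is higher, 'Down' if lower, otherwise 'Unchanged'
    labels = np.select(
        [merged['Future_EFFR'] > merged['EFFR'],
         merged['Future_EFFR'] < merged['EFFR']],
        ['Up', 'Down'],
        default='Unchanged'
    )
    return pd.Series(labels, index=merged.index, name='Rate_Change')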
Feature engineering techniques were applied to extract meaningful information from the textual data. The text was preprocessed by converting it to lowercase, removing stop words (common words like “the” and “and”), and lemmatizing the remaining words to their base form. Two vectorization methods were used to convert the preprocessed text into numerical features:
- Term Frequency-Inverse Document Frequency (TF-IDF): This method assigns weights to words based on their frequency in each document and their rarity across the entire corpus. It helps identify words that are more informative for distinguishing between documents.
- Count Vectorization: This method simply counts the occurrence of each word in each document, creating a matrix of word frequencies.
In addition to the vectorized text features, sentiment analysis was performed using the TextBlob library. Two sentiment scores were calculated for each document:
- Polarity: A measure of the positive or negative sentiment expressed in the text, ranging from -1 (very negative) to 1 (very positive).
- Subjectivity: A measure of the degree of personal opinion or factual information in the text, ranging from 0 (very objective) to 1 (very subjective).
The resulting feature set consisted of the TF-IDF features, count features, polarity scores, and subjectivity scores. The target variable, Rate_Change, was one-hot encoded to represent the three possible outcomes: rate increase, rate decrease, or no change.
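As a rough sketch of how these features come together (a simplified version of the feature_engineering function in the Code section, using the same max_features setting), the pipeline looks roughly like this:

import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from textblob import TextBlob

def build_features(texts: pd.Series, rate_change: pd.Series):
    # Vectorize the (already preprocessed) text in two ways
    tfidf = TfidfVectorizer(max_features=10000).fit_transform(texts).toarray()
    counts = CountVectorizer(max_features=10000).fit_transform(texts).toarray()
    # TextBlob sentiment: polarity in [-1, 1], subjectivity in [0, 1]
    sentiment = np.array([[TextBlob(t).sentiment.polarity,
                           TextBlob(t).sentiment.subjectivity] for t in texts])
    X = np.concatenate((tfidf, counts, sentiment), axis=1)
    y = pd.get_dummies(rate_change)  # one-hot target: Down / Unchanged / Up
    return X, y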
To visualize the relationships between the textual features and the effective federal funds rate, several plots were generated. Figure 1 shows the verbosity (measured by the length of the FOMC statements) and the effective federal funds rate over time. It reveals an interesting trend: as the federal funds rate decreased, particularly during periods of economic stress like the 2008 financial crisis, the FOMC statements tended to become longer and more detailed, likely reflecting the need for more extensive communication and guidance during uncertain times.
Figure 2 presents a scatterplot of the verbosity versus the effective federal funds rate, with the points colored according to the rate change category. The plot shows that rate increases, decreases, and unchanged outcomes all occur across a similar range of statement lengths, suggesting that verbosity alone does not cleanly separate the rate-change categories.
These visualizations provide initial insights into the potential connections between the FOMC’s communication style and the monetary policy decisions reflected in the federal funds rate. The preprocessed and engineered features, along with the encoded target variable, form the input for the machine learning models developed in this project to predict rate changes based on the textual content of FOMC statements and minutes.
Methodology
In this project, I employed a combination of natural language processing (NLP) techniques and machine learning models to predict changes in the Federal Funds Rate based on the textual content of FOMC meeting statements and minutes. The methodology involved several key steps, including data preprocessing, feature engineering, model selection, and hyperparameter tuning.
After merging the FOMC meeting statements and minutes dataset with the effective federal funds rate dataset, I applied various NLP techniques to preprocess the textual data. This included converting the text to lowercase, removing stop words (common words like “the” and “and”), and lemmatizing the remaining words to their base form using the WordNetLemmatizer from the Natural Language Toolkit (NLTK) library. These preprocessing steps aimed to reduce noise and standardize the text data for further analysis.
Next, I performed feature engineering to extract meaningful numerical representations from the preprocessed text. Two vectorization methods were employed: Term Frequency-Inverse Document Frequency (TF-IDF) and Count Vectorization. TF-IDF assigns weights to words based on their frequency in each document and their rarity across the entire corpus, helping to identify words that are more informative for distinguishing between documents. Count Vectorization, on the other hand, simply counts the occurrence of each word in each document, creating a matrix of word frequencies. Additionally, sentiment analysis was conducted using the TextBlob library to calculate polarity and subjectivity scores for each document. The resulting feature set consisted of the TF-IDF features, count features, polarity scores, and subjectivity scores.
To establish a baseline for comparison, I first trained a linear regression model using the engineered features. The model was fit on the training data, and its accuracy was evaluated on the test set. This simple model served as a reference point to assess the performance of more complex models. Figure 3 displays the outcomes of predicting on the validation set using this linear regression model.
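Since linear regression produces a continuous score for each one-hot column rather than a class label, the predicted class is taken to be the column with the largest value. A minimal sketch of the baseline, assuming X_train, X_test, y_train, and y_test come from the train/test split described above:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import accuracy_score

lin_reg = LinearRegression().fit(X_train, y_train)          # y_train is one-hot encoded
pred_classes = np.argmax(lin_reg.predict(X_test), axis=1)   # pick the highest-scoring class
true_classes = np.argmax(y_test.values, axis=1)
print("Baseline accuracy:", accuracy_score(true_classes, pred_classes))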
The primary focus of this project was on developing neural network models for predicting rate changes. I implemented a sequential neural network architecture using the Keras library with TensorFlow as the backend. The architecture consisted of an input layer, two dense layers with dropout regularization, and an output layer with softmax activation. The dense layers used either ReLU or tanh activation functions, determined through hyperparameter tuning.
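A minimal Keras sketch of this architecture (the unit counts, dropout rate, and learning rate shown here are illustrative placeholders; the actual values were chosen by the search described next, and X_train is assumed to come from the train/test split):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Input
from tensorflow.keras.optimizers import Adam

model = Sequential([
    Input(shape=(X_train.shape[1],)),   # engineered feature vector
    Dense(64, activation='relu'),
    Dropout(0.5),
    Dense(128, activation='relu'),
    Dense(3, activation='softmax')      # three classes: Down / Unchanged / Up
])
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(learning_rate=0.0001),
              metrics=['accuracy'])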
To find the optimal hyperparameters for the neural network model, I employed a pseudo-random search approach. I generated 30 unique model configurations by randomly selecting values for the number of units in the dense layers, activation functions, dropout rates, and learning rates for the Adam optimizer. Each configuration was trained and evaluated independently, with the best-performing model being selected based on the validation accuracy.
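Each configuration is simply a dictionary of randomly drawn hyperparameters; a simplified sketch of how the 30 candidates were generated (mirroring the search loop in the Code section):

import random

model_configs = []
for _ in range(30):
    model_configs.append({
        'dense_1_units': random.choice([64, 128, 256]),
        'dense_2_units': random.choice([32, 64, 128]),
        'activation': random.choice(['relu', 'tanh']),
        'dropout': random.uniform(0.3, 0.7),
        'learning_rate': random.uniform(0.0001, 0.001),
    })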
During the training process, I used the EarlyStopping callback from Keras to guard against overfitting. The callback monitored the validation loss and stopped training if no improvement was observed for a specified number of epochs (the patience), restoring the weights from the best epoch on the validation set.
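In Keras this amounts to passing an EarlyStopping callback to fit; a minimal sketch using the patience and epoch settings from this project:

from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
history = model.fit(X_train, y_train,
                    validation_data=(X_test, y_test),
                    epochs=100,
                    callbacks=[early_stop])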
The performance of the models was evaluated using accuracy as the primary metric. For the linear regression model, the continuous predictions were converted to class labels by selecting the class with the highest predicted value, and accuracy was calculated by comparing the predicted class labels with the true labels. For the neural network models, the validation accuracy was monitored during training, and the best model was selected based on the highest validation accuracy achieved.
To visualize the performance of the models, several plots were generated. The validation accuracies of all the neural network models were plotted to compare their performance and select the best model, which can be seen in Figure 4. Additionally, the training and validation accuracy curves for the best neural network model were plotted to observe the model’s learning progress and identify any signs of overfitting, which can be seen in Figure 5.
In summary, the methodology employed in this project involved a combination of NLP techniques for text preprocessing and feature engineering, followed by the development and evaluation of machine learning models, specifically linear regression and neural networks. The neural network models underwent hyperparameter tuning using a pseudo-random search approach, and the best model was selected based on its validation accuracy. The performance of the models was visualized through accuracy plots and compared to assess their effectiveness in predicting Federal Funds Rate changes based on FOMC meeting statements and minutes.
Results and Interpretation
The results of this project demonstrate the potential of using machine learning techniques to predict changes in the Federal Funds Rate based on the textual content of FOMC meeting statements and minutes. The best-performing neural network model achieved a validation accuracy of 61.90%, which is notably higher than the baseline accuracy of random guessing (roughly 33% for a three-class problem).
The model’s architecture consisted of an input layer, two dense layers with 64 and 128 units respectively, and an output layer with softmax activation. Dropout regularization was applied between the dense layers to prevent overfitting. The model used the ReLU activation function for the dense layers and was optimized using the Adam optimizer with a learning rate of 0.0001. The model was trained for a maximum of 100 epochs, with early stopping based on validation loss and a patience of 10 epochs.
The training and validation accuracy curves for the best model (Figure 5) show a slight improvement in validation accuracy over the course of training, peaking at epoch 12. The absence of a sudden divergence between the training and validation curves suggests that the model did not suffer from severe overfitting at this point.
However, it is important to note that while the best model outperformed random guessing and the linear regression baseline (which achieved an accuracy of approximately 52%), its accuracy of 61.90% leaves considerable room for improvement. This suggests that predicting Federal Funds Rate changes based solely on the textual content of FOMC statements and minutes is a challenging task, and that additional features or more advanced modeling techniques may be necessary to achieve higher accuracy.
One potential limitation of the current approach is the reliance on a relatively small dataset, as the number of FOMC meetings and corresponding rate changes is limited. This constraint may hinder the model’s ability to learn more complex patterns and generalize well to unseen data. Additionally, the textual content of FOMC statements and minutes, while informative, may not capture all the relevant factors that influence rate change decisions, such as broader economic indicators and global events.
Despite these limitations, the results of this project highlight the potential of applying machine learning techniques to economic forecasting tasks. The combination of natural language processing and neural network modeling demonstrates the ability to extract meaningful insights from unstructured text data and use them to make predictions about future economic outcomes.
Future work in this area could explore several avenues for improvement, such as:
1. Incorporating additional features. Combining the textual features with quantitative economic indicators, such as inflation rates, unemployment rates, and GDP growth, may provide a more comprehensive representation of the factors influencing rate change decisions.
2. Experimenting with more advanced NLP techniques. Using more sophisticated techniques, such as word embeddings or transformer-based models (e.g., BERT), could help capture more nuanced semantic information from the FOMC statements and minutes.
3. Exploring alternative model architectures. Investigating other neural network architectures, such as recurrent neural networks (RNNs) or convolutional neural networks (CNNs), may be better suited for capturing temporal dependencies or local patterns in the textual data.
4. Expanding the dataset with other outside data sources. Incorporating additional sources of relevant text, such as speeches by Federal Reserve officials or economic news articles, could add context for predicting rate changes and potentially improve model performance in a real-world environment.
Conclusion
In this project, I explored the application of machine learning techniques to predict changes in the Federal Funds Rate based on the textual content of FOMC meeting statements and minutes. By combining natural language processing and neural network modeling, I aimed to extract meaningful insights from these important economic documents and use them to forecast future monetary policy decisions.
The best-performing neural network model achieved a validation accuracy of 61.90%, demonstrating the potential of this approach to outperform random guessing and a linear regression baseline. This result highlights the ability of machine learning to uncover patterns and relationships in unstructured text data that can be leveraged for economic forecasting tasks.
However, the model’s accuracy also underscores the challenges associated with predicting complex economic outcomes based solely on textual information. The FOMC’s decision-making process is influenced by a wide range of factors, including quantitative economic indicators and global events, which may not be fully captured by the language used in their statements and minutes.
To further improve the predictive accuracy of such models, future research should focus on incorporating additional features, such as quantitative economic indicators, and exploring more advanced NLP techniques and model architectures. Expanding the dataset to include other relevant sources of text data, such as speeches by Federal Reserve officials or economic news articles, could also provide a richer context for predicting rate changes.
Moreover, the interpretability of machine learning models in the context of economic decision-making is an important consideration. While this project focused primarily on predictive accuracy, future work should also investigate methods for explaining and visualizing the factors that drive the model’s predictions.
In conclusion, I believe this project demonstrates the potential of machine learning to contribute to the field of economic forecasting, particularly in the context of predicting Federal Funds Rate changes based on the textual content of FOMC meeting statements and minutes. As machine learning techniques continue to advance, they are likely to play an increasingly important role in informing economic decision-making and shaping monetary policy.
References
Doh, T., Kim, S., & Yang, S.-K. (2021). How you say it matters: Text analysis of FOMC statements using natural language processing. Economic Review, 106(1), 25–40. https://www.kansascityfed.org/Economic%20Review/documents/7577/erv106n1dohkimyang.pdf
Kim, W., Spörer, J. F., & Handschuh, S. (2023). Analyzing FOMC Minutes: Accuracy and Constraints of Language Models. arXiv preprint arXiv:2304.10164. https://arxiv.org/abs/2304.10164
Meade, E. E., & Acosta, M. (2015, September 30). Hanging on every word: Semantic analysis of the FOMC’s post meeting statement. FEDS Notes. Board of Governors of the Federal Reserve System. https://doi.org/10.17016/2380-7172.1580
Morgan Stanley. (2023, April 25). Sensing the Fed’s direction with the help of AI. Morgan Stanley. https://www.morganstanley.com/articles/mnlpfeds-sentiment-index-federal-reserve
Wikipedia contributors. (2024, April 5). Fedspeak. Wikipedia. https://en.wikipedia.org/wiki/Fedspeak
Code
import pandas as pd
import kaggle
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.metrics import classification_report, confusion_matrix
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping
from textblob import TextBlob
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
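# Note: the NLTK stopword list and WordNet data used below must be downloaded once beforehand,
# e.g. with nltk.download('stopwords') and nltk.download('wordnet')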
import numpy as np
import random
# Authenticate and download datasets
def download_datasets():
    kaggle.api.authenticate()
    kaggle.api.dataset_download_files('vladtasca/fomc-meeting-statements-and-minutes', path='.', unzip=True)
    kaggle.api.dataset_download_files('natashk/effective-federal-funds-rate', path='.', unzip=True)
# Load and prepare FOMC dataset
def load_prepare_fomc_dataset():
    fomc_transcripts_df = pd.read_csv('communications.csv')
    fomc_statements = fomc_transcripts_df[fomc_transcripts_df['Type'] == 'Statement'].copy()  # Create a copy
    # Modify the DataFrame using .loc[]
    fomc_statements.loc[:, 'Date'] = pd.to_datetime(fomc_statements['Date'])
    fomc_statements.set_index('Date', inplace=True)
    fomc_statements.loc[:, 'Text Length'] = fomc_statements['Text'].str.len()
    return fomc_statements
# Load and prepare Federal Funds Rate dataset
def load_prepare_fed_funds_dataset():
    fed_funds_df = pd.read_csv('FRB_H15.csv')
    fed_funds_df['Date'] = pd.to_datetime(fed_funds_df['Time Period'])
    fed_funds_df.set_index('Date', inplace=True)
    fed_funds_df = fed_funds_df[['RIFSPFF_N.D']].rename(columns={'RIFSPFF_N.D': 'EFFR'})
    fed_funds_df['Future_EFFR'] = fed_funds_df['EFFR'].shift(-90)
    return fed_funds_df
# Merge datasets
def merge_datasets(fomc_statements, fed_funds_df):
    merged = fomc_statements.join(fed_funds_df[['EFFR', 'Future_EFFR']], how='inner')
    merged['Rate_Change'] = np.select(
        [
            merged['Future_EFFR'] > merged['EFFR'],
            merged['Future_EFFR'] < merged['EFFR']
        ],
        [
            'Up',
            'Down'
        ],
        default='Unchanged'
    )
    return merged
# Preprocess text
def preprocess_text(text):
    stop_words = set(stopwords.words('english'))
    lemmatizer = WordNetLemmatizer()
    text = text.lower()
    words = [lemmatizer.lemmatize(word) for word in text.split() if word not in stop_words]
    return ' '.join(words)
# Feature Engineering with Text Data
def feature_engineering(merged):
    tfidf_vectorizer = TfidfVectorizer(max_features=10000)
    tfidf_features = tfidf_vectorizer.fit_transform(merged['Text'].apply(preprocess_text)).toarray()
    count_vectorizer = CountVectorizer(max_features=10000)
    count_features = count_vectorizer.fit_transform(merged['Text'].apply(preprocess_text)).toarray()
    merged['Polarity'] = merged['Text'].apply(lambda x: TextBlob(x).sentiment.polarity)
    merged['Subjectivity'] = merged['Text'].apply(lambda x: TextBlob(x).sentiment.subjectivity)
    X = np.concatenate((tfidf_features, count_features, merged[['Polarity', 'Subjectivity']].values), axis=1)
    y = pd.get_dummies(merged['Rate_Change'])
    return X, y
# Build Neural Network Model
def build_model(X_train, y_train, model_config):
    model = Sequential()
    model.add(Input(shape=(X_train.shape[1],)))
    model.add(Dense(model_config['dense_1_units'], activation=model_config['activation']))
    model.add(Dropout(model_config['dropout']))
    model.add(Dense(model_config['dense_2_units'], activation=model_config['activation']))
    model.add(Dense(y_train.shape[1], activation=model_config['output_activation']))
    model.compile(loss=model_config['loss'], optimizer=model_config['optimizer'], metrics=['accuracy'])
    return model
# Train the model
def train_model(model, X_train, y_train, X_test, y_test, model_config):
    early_stop = EarlyStopping(monitor='val_loss', patience=model_config['patience'], restore_best_weights=True)
    history = model.fit(X_train, y_train, epochs=model_config['epochs'], validation_data=(X_test, y_test), callbacks=[early_stop], verbose=1)
    return history
# Plot verbosity vs. effective federal funds rate over time
def plot_verbosity_vs_ffr_over_time(merged):
    merged = merged.reset_index()  # Reset the index to include the 'Date' column
    merged['Year'] = pd.to_datetime(merged['Date']).dt.year  # Extract the year from the 'Date' column
    yearly_data = merged.groupby('Year')[['Text Length', 'EFFR']].mean()  # Group by year and calculate means
    fig, ax1 = plt.subplots(figsize=(10, 6))
    sns.lineplot(x=yearly_data.index, y='Text Length', data=yearly_data, ax=ax1, color='b', label='Verbosity')
    ax1.set_xlabel('Year')
    ax1.set_ylabel('Average Text Length', color='b')
    ax1.tick_params('y', colors='b')
    ax1.legend().set_visible(False)
    ax2 = ax1.twinx()
    sns.lineplot(x=yearly_data.index, y='EFFR', data=yearly_data, ax=ax2, color='r', label='Effective Federal Funds Rate')
    ax2.set_ylabel('Average Effective Federal Funds Rate', color='r')
    ax2.tick_params('y', colors='r')
    ax2.legend().set_visible(False)
    # Collect the handles and labels from both axes after disabling their automatic legends
    lines1, labels1 = ax1.get_legend_handles_labels()
    lines2, labels2 = ax2.get_legend_handles_labels()
    lines = lines1 + lines2
    labels = labels1 + labels2
    # Manually create and set the combined legend
    ax1.legend(lines, labels, loc='upper left')
    plt.title('Verbosity vs. Effective Federal Funds Rate Over Time', fontsize=16)
    plt.tight_layout()
    plt.savefig('verbosity_vs_ffr_over_time.png', dpi=300, bbox_inches='tight')
# Plot verbosity vs. rates
def plot_verbosity_vs_rates(merged):
    plt.figure(figsize=(10, 6))
    sns.scatterplot(x='Text Length', y='EFFR', data=merged, hue='Rate_Change', palette='colorblind', s=100)
    plt.title('Verbosity vs. 3 Month Rate Change', fontsize=16)
    plt.xlabel('Text Length', fontsize=14)
    plt.ylabel('Effective Federal Funds Rate', fontsize=14)
    # Set legend title
    plt.legend(title='3 Month Rate Change')
    plt.savefig('verbosity_vs_rates.png', dpi=300, bbox_inches='tight')
# Given a Linear Regression model that has been fit, we want to display some of the results of the model
def plot_linear_reg_example(lin_reg, X_test, y_test):
    y_pred = lin_reg.predict(X_test)
    y_pred_classes = np.argmax(y_pred, axis=1)
    y_true = np.argmax(y_test.values, axis=1)
    plt.figure(figsize=(10, 6))
    sns.lineplot(x=y_pred_classes, y=y_true, marker='o')
    plt.title('Linear Regression Model: Prediction vs. Actual', fontsize=16)
    plt.xlabel('Predicted Rate Change', fontsize=14)
    plt.ylabel('Actual Rate Change', fontsize=14)
    plt.savefig('linear_reg_example.png', dpi=300, bbox_inches='tight')
# Plotting function for all model accuracies
def plot_all_accuracies(accuracies):
    plt.figure(figsize=(10, 6))
    for model_idx, acc_history in enumerate(accuracies):
        sns.lineplot(acc_history, label=f'Model {model_idx+1}')
    plt.title('Validation Accuracy Across Models', fontsize=16)
    plt.xlabel('Epochs', fontsize=14)
    plt.ylabel('Accuracy', fontsize=14)
    plt.legend(loc='lower right')
    plt.savefig('all_model_accuracies.png', dpi=300, bbox_inches='tight')
# Plot training and validation accuracy for the best model
def plot_best_model_accuracy(history):
    plt.figure(figsize=(10, 6))
    sns.lineplot(history.history['accuracy'], label='Training Accuracy')
    sns.lineplot(history.history['val_accuracy'], label='Validation Accuracy')
    plt.title('Training and Validation Accuracy for Best Model', fontsize=16)
    plt.xlabel('Epochs', fontsize=14)
    plt.ylabel('Accuracy', fontsize=14)
    plt.legend(loc='lower right')
    plt.savefig('best_model_accuracy.png', dpi=300, bbox_inches='tight')
def generate_model_report(best_model, best_history, best_val_acc):
print("Model Report")
print("============")
print("Best Model Summary:")
# Print model architecture
print(best_model.summary())
# Display best validation accuracy
print("\nBest Validation Accuracy: {:.2f}%".format(best_val_acc * 100))
# Get the epoch number where the best validation accuracy was achieved
best_epoch = best_history.history['val_accuracy'].index(best_val_acc) + 1
print("Achieved at epoch:", best_epoch)
# Display final training and validation accuracies
final_train_acc = best_history.history['accuracy'][-1]
final_val_acc = best_history.history['val_accuracy'][-1]
print("\nFinal Training Accuracy: {:.2f}%".format(final_train_acc * 100))
print("Final Validation Accuracy: {:.2f}%".format(final_val_acc * 100))
# Optionally, summarize other metrics like loss
final_train_loss = best_history.history['loss'][-1]
final_val_loss = best_history.history['val_loss'][-1]
print("\nFinal Training Loss: {:.4f}".format(final_train_loss))
print("Final Validation Loss: {:.4f}".format(final_val_loss))
def main():
    download_datasets()
    fomc_statements = load_prepare_fomc_dataset()
    fed_funds_df = load_prepare_fed_funds_dataset()
    merged = merge_datasets(fomc_statements, fed_funds_df)
    plot_verbosity_vs_rates(merged)  # Plot verbosity vs. rates
    plot_verbosity_vs_ffr_over_time(merged)  # Plot verbosity vs. FFR over time
    X, y = feature_engineering(merged)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    # Before testing a Neural Net, test a simpler model: regression
    # Train a Linear Regression model
    lin_reg = LinearRegression()
    lin_reg.fit(X_train, y_train)
    y_pred = lin_reg.predict(X_test)
    # Metric that we can use to compare with the neural network
    y_pred_classes = np.argmax(y_pred, axis=1)
    y_true = np.argmax(y_test.values, axis=1)
    print("Accuracy for Linear Regression model:", accuracy_score(y_true, y_pred_classes))
    # Plot a few example data points of prediction vs. actual
    plot_linear_reg_example(lin_reg, X_test, y_test)
    # Generate 30 unique model configurations via pseudo-random hyperparameter selection
    model_configs = []
    for _ in range(30):
        model_config = {
            'dense_1_units': random.choice([64, 128, 256]),
            'dense_2_units': random.choice([32, 64, 128]),
            'activation': random.choice(['relu', 'tanh']),
            'output_activation': 'softmax',
            'dropout': random.uniform(0.3, 0.7),
            'loss': 'categorical_crossentropy',
            'optimizer': Adam(learning_rate=random.uniform(0.0001, 0.001)),
            'epochs': 100,
            'patience': 10
        }
        if model_config not in model_configs:  # Ensure uniqueness
            model_configs.append(model_config)
    accuracies = []
    best_model = None
    best_history = None
    best_val_acc = 0
    for config in model_configs:
        model = build_model(X_train, y_train, config)
        history = train_model(model, X_train, y_train, X_test, y_test, config)
        accuracies.append(history.history['val_accuracy'])
        # Keep track of the best model
        val_acc = max(history.history['val_accuracy'])
        if val_acc > best_val_acc:
            best_val_acc = val_acc
            best_model = model
            best_history = history
    plot_all_accuracies(accuracies)  # Plot all model accuracies
    plot_best_model_accuracy(best_history)  # Plot training and validation accuracy for the best model
    generate_model_report(best_model, best_history, best_val_acc)
if __name__ == '__main__':
    main()