diff --git a/multimodal-understanding/code-patterns/01-NovaPremier-code-generation.ipynb b/multimodal-understanding/code-patterns/01-NovaPremier-code-generation.ipynb new file mode 100644 index 00000000..fadcce00 --- /dev/null +++ b/multimodal-understanding/code-patterns/01-NovaPremier-code-generation.ipynb @@ -0,0 +1,880 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "582923a9-6333-4b08-a243-98cb5d6211e9", + "metadata": {}, + "source": [ + "# Code development with Amazon Nova Premier on Amazon Bedrock" + ] + }, + { + "cell_type": "markdown", + "id": "0059fd64-314f-48b1-85b9-15aeeae2b187", + "metadata": {}, + "source": [ + "[Amazon Nova Premier](https://aws.amazon.com/ai/generative-ai/nova/understanding/) is part of Amazon Nova, a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance. Build and scale generative AI applications with Amazon Nova foundation models, seamlessly integrated in Amazon Bedrock. " + ] + }, + { + "cell_type": "markdown", + "id": "6d8e98cd-110a-415e-ac8f-a94b3fad89e8", + "metadata": {}, + "source": [ + "## Getting Started with Amazon Nova Premier on Bedrock\n", + "\n", + "[Amazon Nova Premier](https://www.amazon.science/publications/amazon-nova-premier-technical-report-and-model-card) is available as a fully managed, serverless option in Amazon Bedrock. Let's walk through the process of enabling Amazon Nova Premier and setting up your development environment.\n", + "\n", + "### Prerequisites\n", + "\n", + "Before we begin, ensure you have:\n", + "\n", + "1. An AWS account (if you don't have one, create it [here](https://docs.aws.amazon.com/accounts/latest/reference/manage-acct-creating.html)).\n", + "2. 
Understanding of [Amazon Bedrock](https://aws.amazon.com/bedrock/), [Amazon SageMaker](https://docs.aws.amazon.com/sagemaker/latest/dg/gs-set-up.html#gs-account) (optional) and [AWS Identity and Access Management (IAM)](https://aws.amazon.com/iam/).\n", + "3. Appropriate IAM permissions for Amazon Bedrock.\n", + "4. [Model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) enabled for the required Amazon Bedrock models." + ] + }, + { + "cell_type": "markdown", + "id": "e9500c9c-5559-4c9a-ac1b-3f4ee2bf594f", + "metadata": {}, + "source": [ + "### Enabling Amazon Nova Premier in Bedrock (Model Access)\n", + "\n", + "1. **Sign in to the AWS Management Console**:\n", + " Navigate to the Amazon Bedrock [console](https://console.aws.amazon.com/bedrock/).\n", + "\n", + "2. **Request model access**:\n", + " In the Bedrock console, select \"Model access\" from the left navigation menu, then click \"Manage model access\".\n", + "\n", + "![Bedrock Model Access](img/nova_1.png \"Request model access on Bedrock\")\n", + "\n", + "3. **Enable Amazon Nova Premier**:\n", + " Click \"Modify model access\", find Amazon Nova Premier in the list of available models, select the checkbox, and click \"Request model access\". Your access request will typically be approved within minutes.\n", + "\n", + "![Modify Model Access](img/nova_2.png \"Click on Modify model access\")\n", + "\n", + "4. **Verify access**:\n", + " Once approved, you'll see Amazon Nova Premier listed as \"Access granted\" on the Model Access page.\n", + "\n", + "![Access Granted](img/nova_3.png \"Access Granted\")\n", + "\n", + "5. **Set up your development environment**:\n", + " You'll need the AWS SDK for Python (Boto3) to interact with Bedrock. Install it using pip:\n", + "\n", + " ```bash\n", + " pip install boto3\n", + " ```\n", + "\n", + "Now you're ready to use the capabilities of Amazon Nova Premier!" 
+ ] + }, + { + "cell_type": "markdown", + "id": "5142ca98-d1ad-4ed8-84fa-6fd393f8d3ea", + "metadata": {}, + "source": [ + "## Set up the environment\n", + "\n", + "Before we begin, install the required libraries, set up your credentials for Amazon Bedrock, and define a helper function to call the service." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "53bf7434", + "metadata": {}, + "outputs": [], + "source": [ + "# Uncomment to install AWS SDK for Python (boto3)\n", + "#!pip install boto3 -U" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4b0ceb89", + "metadata": {}, + "outputs": [], + "source": [ + "import boto3\n", + "\n", + "# Set up the Bedrock client\n", + "bedrock_runtime = boto3.client(service_name=\"bedrock-runtime\")" + ] + }, + { + "cell_type": "markdown", + "id": "453aaa32-765c-45cc-9fef-4e5fba8c7f94", + "metadata": {}, + "source": [ + "The following helper function handles Bedrock invocations using the [Converse API](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-examples.html).\n", + "\n", + "It has two invocation paths: a simple one with regular parameters, and another that includes `tool_config` for the [Tool Use](https://docs.aws.amazon.com/bedrock/latest/userguide/tool-use.html) feature. Tool use is also known as function calling." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7f97ec0a-8759-4ff0-8638-f7e43854f409", + "metadata": {}, + "outputs": [], + "source": [ + "# Helper Function to call Amazon Bedrock\n", + "def bedrock_run(input_msg,\n", + " system_msg=None,\n", + " tool_config=None,\n", + " model_id = \"us.amazon.nova-premier-v1:0\"):\n", + "\n", + " print(f\"Invoking model {model_id} on Bedrock\")\n", + "\n", + " # Base Prompts\n", + " system_prompts = []\n", + "\n", + " # Filling values if exists\n", + " if system_msg:\n", + " system_prompts.append({\"text\": system_msg})\n", + "\n", + " # Set the array of messages to be sent to the model\n", + " messages = [{\n", + " \"role\": \"user\",\n", + " \"content\": input_msg\n", + " }]\n", + "\n", + " # Placeholder for the results of the model\n", + " predicted = \"\"\n", + "\n", + " # Check if it is a tool calling ask\n", + " if tool_config:\n", + " inference_config = {\n", + " \"temperature\": 1, \n", + " \"topP\": 1\n", + " }\n", + " \n", + " additional_model_request_fields={\n", + " \"inferenceConfig\": {\n", + " \"topK\": 1,\n", + " },\n", + " }\n", + " \n", + " # Call Amazon Bedrock Inference\n", + " response = bedrock_runtime.converse(\n", + " modelId=model_id,\n", + " system=system_prompts,\n", + " messages=messages,\n", + " inferenceConfig=inference_config,\n", + " toolConfig=tool_config,\n", + " additionalModelRequestFields=additional_model_request_fields\n", + " )\n", + " \n", + " # Parse the response returned from Bedrock\n", + " tool_requests = response['output']['message']['content']\n", + " for tool_request in tool_requests:\n", + " if 'toolUse' in tool_request:\n", + " tool = tool_request['toolUse']\n", + " predicted = tool\n", + " else:\n", + " # Base inference parameters to use.\n", + " inference_config = {\"temperature\": 0.2}\n", + " \n", + " # Call Amazon Bedrock Inference\n", + " response = bedrock_runtime.converse(\n", + " modelId=model_id,\n", + " system=system_prompts,\n", + " 
messages=messages,\n", + " inferenceConfig=inference_config\n", + " )\n", + "\n", + " # Parse the response returned from Bedrock\n", + " output_message = response['output']['message']\n", + " for content in output_message['content']:\n", + " answer = content['text']\n", + " predicted += answer\n", + "\n", + " # Returns the result of the prediction\n", + " return predicted" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2f9f8b7b-995a-44f8-aca9-cda668e4e750", + "metadata": {}, + "outputs": [], + "source": [ + "# Test the helper to check that your setup is working\n", + "\n", + "input_msg = [{\"text\": \"When was AWS founded?\"}]\n", + "bedrock_run(input_msg)" + ] + }, + { + "cell_type": "markdown", + "id": "57c721d6", + "metadata": {}, + "source": [ + "Now that we know our environment setup is working, we can start using Nova Premier to help us develop applications. In this notebook we will guide you through the following tasks:\n", + "\n", + "1. **Code generation**: Nova Premier excels at generating high-quality code across a wide range of programming languages, including Java, Python, and more. It can translate natural language instructions into functional code, streamlining the development process and accelerating project timelines.\n", + "\n", + "2. **Code debugging**: Leveraging its understanding of code patterns and common pitfalls, Nova Premier can identify bugs, security vulnerabilities, and inefficiencies in code. It suggests fixes and improvements with detailed explanations of the issues found.\n", + "\n", + "3. **Code explanation**: The model can provide detailed explanations of existing code, breaking down complex functions and algorithms into understandable components. This feature is invaluable for knowledge transfer, onboarding new team members, and working with legacy codebases.\n", + "\n", + "4. 
**Code execution**: Nova Premier includes the capability to simulate code execution, predicting outputs and potential errors without actually running the code. It can generate test cases and validate functionality across multiple scenarios.\n", + "\n", + "5. **Code refactoring**: The model can restructure existing code to improve readability, performance, and maintainability without changing its external behavior, following best practices and design patterns appropriate for each language." + ] + }, + { + "cell_type": "markdown", + "id": "e8f6b8c3-00c4-4128-9dd3-b7332263e8be", + "metadata": {}, + "source": [ + "### 1 - Code Generation\n", + "\n", + "In our use case we will use Amazon Nova Premier to help us develop a machine learning model that can perform time series forecasting." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "19c7e1f9-0a35-448f-8272-f158c56cffe2", + "metadata": {}, + "outputs": [], + "source": [ + "# Helper function to define a template for our code generation prompts\n", + "def generate_prompt(description):\n", + " prompt = f\"\"\"\n", + "You are an expert Python developer specializing in time series forecasting and AWS services. \n", + "Generate clean, efficient, and well-documented code based on the following requirements:\n", + "\n", + "{description}\n", + "\n", + "It must return only the Python code without any additional text before or after.\n", + "\"\"\"\n", + " return prompt" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e13fd43f-923f-4e54-991a-7d5753da18cd", + "metadata": {}, + "outputs": [], + "source": [ + "# Set up our instructions (prompt) to the model\n", + "\n", + "description = \"\"\"Create a complete Python file for the following tasks. Every step should be a function inside this Python file:\n", + "1. Generates synthetic daily sales data for a retail store with realistic seasonality and trend over 3 years\n", + "2. Splits the data into training and testing sets\n", + "3. 
Trains an XGBoost time series forecasting model\n", + "4. Evaluates the model using appropriate metrics (RMSE, MAPE)\n", + "5. Serializes the model and saves it locally\n", + "6. Creates a function to handle model invocations\n", + "7. Runs a function call to invoke the model\n", + "\n", + "Make sure to include all necessary imports and explain key components with comments.\n", + "\"\"\"\n", + "\n", + "result = generate_prompt(description)\n", + "print(result)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "71e05540-a17a-423d-b9fc-bf3bef0cc23a", + "metadata": {}, + "outputs": [], + "source": [ + "# Now call Amazon Nova Premier to get the code\n", + "input_msg = [{\"text\": result}]\n", + "response = bedrock_run(input_msg)\n", + "\n", + "print(response)" + ] + }, + { + "cell_type": "markdown", + "id": "66184672", + "metadata": {}, + "source": [ + "#### Test the generated code\n", + "\n", + "**Note: We're not using the \"main\" function generated by the model because we're going step by step over the next cells. 
Let's assume the next snippet of code was generated by the model, then copied and pasted into the next cell.**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "700efbb8-df05-42d1-a073-84f6336b3a10", + "metadata": {}, + "outputs": [], + "source": [ + "# You'll probably need to install the xgboost package in your environment; to do so, uncomment the next line\n", + "# !pip install xgboost\n", + "\n", + "import numpy as np\n", + "import pandas as pd\n", + "from sklearn.model_selection import train_test_split\n", + "from sklearn.metrics import mean_squared_error, mean_absolute_percentage_error\n", + "import xgboost as xgb\n", + "import pickle\n", + "\n", + "# Generate synthetic daily sales data\n", + "def generate_synthetic_data():\n", + " np.random.seed(42)\n", + " date_rng = pd.date_range(start='1/1/2018', end='1/1/2021', freq='D')\n", + " trend = np.linspace(0, 1, len(date_rng))\n", + " seasonality = np.sin(2 * np.pi * (date_rng.dayofyear / 365.25))\n", + " noise = np.random.normal(0, 0.1, len(date_rng))\n", + " sales = 100 + 50 * trend + 30 * seasonality + noise\n", + " data = pd.DataFrame(date_rng, columns=['date'])\n", + " data['sales'] = sales\n", + " return data\n", + "\n", + "# Split data into training and testing sets\n", + "def split_data(data):\n", + " train, test = train_test_split(data, test_size=0.2, shuffle=False)\n", + " return train, test\n", + "\n", + "# Train XGBoost model\n", + "def train_model(train):\n", + " X_train = train.index.values.reshape(-1, 1)\n", + " y_train = train['sales']\n", + " model = xgb.XGBRegressor(objective='reg:squarederror', n_estimators=100)\n", + " model.fit(X_train, y_train)\n", + " return model\n", + "\n", + "# Evaluate the model\n", + "def evaluate_model(model, test):\n", + " X_test = test.index.values.reshape(-1, 1)\n", + " y_test = test['sales']\n", + " predictions = model.predict(X_test)\n", + " rmse = np.sqrt(mean_squared_error(y_test, predictions))\n", + " mape = mean_absolute_percentage_error(y_test, 
predictions)\n", + " return rmse, mape\n", + "\n", + "# Serialize and save the model\n", + "def save_model(model, filename='xgboost_model.pkl'):\n", + " with open(filename, 'wb') as file:\n", + " pickle.dump(model, file)\n", + "\n", + "# Function to handle model invocations\n", + "def invoke_model(model_path, input_date):\n", + " with open(model_path, 'rb') as file:\n", + " model = pickle.load(file)\n", + " date_ordinal = pd.to_datetime(input_date).toordinal()\n", + " prediction = model.predict(np.array([[date_ordinal]]))\n", + " return prediction[0]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5830d489-b6ef-4a19-8746-f4065c3022b9", + "metadata": {}, + "outputs": [], + "source": [ + "# Test the synthetic data generation function\n", + "data = generate_synthetic_data()\n", + "\n", + "data.set_index('date', inplace=True)\n", + "data.head(5)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fe0a030d-9034-4895-bc65-2c9e4e6d0866", + "metadata": {}, + "outputs": [], + "source": [ + "# Test the split data function\n", + "train_data, test_data = split_data(data)\n", + "print(f\"Training dataset size: {len(train_data)} and Test dataset size: {len(test_data)}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6ba3a699-3ae4-4326-a489-ac57dad414e4", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "# Test the training function\n", + "model = train_model(train_data)" + ] + }, + { + "cell_type": "markdown", + "id": "dd137427", + "metadata": {}, + "source": [ + "**It seems we have found a bug in the generated code. Let's also use Amazon Nova Premier to fix it.**" + ] + }, + { + "cell_type": "markdown", + "id": "e2d94ba0", + "metadata": {}, + "source": [ + "### 2 - Code Debugging\n", + "\n", + "We can see that the training function throws an error when executed. Let's ask the model how to fix it by copying the returned error and passing it to the model." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7a0a1497-f0ae-4ccd-8298-421eeed43443", + "metadata": {}, + "outputs": [], + "source": [ + "# Error output copied from the failed training cell\n", + "train_error = \"\"\"\n", + "XGBoostError: [18:16:06] /Users/runner/work/xgboost/xgboost/src/c_api/../data/array_interface.h:145: Check failed: typestr.size() == 3 || typestr.size() == 4: `typestr' should be of format .\n", + "Stack trace:\n", + " [bt] (0) 1 libxgboost.dylib 0x0000000172f8dbfc dmlc::LogMessageFatal::~LogMessageFatal() + 124\n", + " [bt] (1) 2 libxgboost.dylib 0x0000000172f9c79c xgboost::ArrayInterfaceHandler::Validate(std::__1::map, std::__1::allocator>, xgboost::Json, std::__1::less, std::__1::allocator, std::__1::allocator> const, xgboost::Json>>> const&) + 1120\n", + " [bt] (2) 3 libxgboost.dylib 0x0000000172fa1048 xgboost::ArrayInterface<2, false>::Initialize(std::__1::map, std::__1::allocator>, xgboost::Json, std::__1::less, std::__1::allocator, std::__1::allocator> const, xgboost::Json>>> const&) + 48\n", + " [bt] (3) 4 libxgboost.dylib 0x0000000173148228 xgboost::data::ArrayAdapter::ArrayAdapter(xgboost::StringView) + 148\n", + " [bt] (4) 5 libxgboost.dylib 0x0000000173147e3c xgboost::data::DMatrixProxy::SetArrayData(xgboost::StringView) + 72\n", + " [bt] (5) 6 libxgboost.dylib 0x0000000172f9a6fc XGProxyDMatrixSetDataDense + 136\n", + " [bt] (6) 7 libffi.8.dylib 0x0000000105bbc04c ffi_call_SYSV + 76\n", + " [bt] (7) 8 libffi.8.dylib 0x0000000105bb9834 ffi_call_int + 1404\n", + " [bt] (8) 9 _ctypes.cpython-312-darwin.so 0x0000000105b9c0f0 _ctypes_callproc + 756\n", + "\"\"\"\n", + "\n", + "input_msg = [{\"text\": f\"\"\"Why is this code {response} returning this error {train_error}?\"\"\"}]\n", + "debug_response = bedrock_run(input_msg)\n", + "print(debug_response)" + ] + }, + { + "cell_type": "markdown", + "id": "189ac367-1e3c-46e2-838c-56abbaefdc93", + "metadata": {}, + "source": [ + "#### Test the fixed version of the code\n", + "Let's execute again, this time 
with brand new code fixed by Amazon Nova Premier" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7b5b5666-1523-4771-8da7-b0065a1857d9", + "metadata": {}, + "outputs": [], + "source": [ + "# Set up of the code generated by Amazon Nova Premier\n", + "\n", + "import numpy as np\n", + "import pandas as pd\n", + "from sklearn.model_selection import train_test_split\n", + "from sklearn.metrics import mean_squared_error, mean_absolute_percentage_error\n", + "import xgboost as xgb\n", + "import pickle\n", + "\n", + "# Generate synthetic daily sales data\n", + "def generate_synthetic_data():\n", + " np.random.seed(42)\n", + " date_rng = pd.date_range(start='1/1/2018', end='1/1/2021', freq='D')\n", + " trend = np.linspace(0, 1, len(date_rng))\n", + " seasonality = np.sin(2 * np.pi * (date_rng.dayofyear / 365.25))\n", + " noise = np.random.normal(0, 0.1, len(date_rng))\n", + " sales = 100 + 50 * trend + 30 * seasonality + noise\n", + " data = pd.DataFrame(date_rng, columns=['date'])\n", + " data['sales'] = sales\n", + " return data\n", + "\n", + "# Split data into training and testing sets\n", + "def split_data(data):\n", + " train, test = train_test_split(data, test_size=0.2, shuffle=False)\n", + " return train, test\n", + "\n", + "# Train XGBoost model\n", + "def train_model(train):\n", + " # Convert dates to ordinal values\n", + " X_train = train['date'].apply(lambda x: x.toordinal()).values.reshape(-1, 1)\n", + " y_train = train['sales']\n", + " model = xgb.XGBRegressor(objective='reg:squarederror', n_estimators=100)\n", + " model.fit(X_train, y_train)\n", + " return model\n", + "\n", + "# Evaluate the model\n", + "def evaluate_model(model, test):\n", + " # Convert dates to ordinal values\n", + " X_test = test['date'].apply(lambda x: x.toordinal()).values.reshape(-1, 1)\n", + " y_test = test['sales']\n", + " predictions = model.predict(X_test)\n", + " rmse = np.sqrt(mean_squared_error(y_test, predictions))\n", + " mape = 
mean_absolute_percentage_error(y_test, predictions)\n", + " return rmse, mape\n", + "\n", + "# Serialize and save the model\n", + "def save_model(model, filename='xgboost_model.pkl'):\n", + " with open(filename, 'wb') as file:\n", + " pickle.dump(model, file)\n", + "\n", + "# Function to handle model invocations\n", + "def invoke_model(model_path, input_date):\n", + " with open(model_path, 'rb') as file:\n", + " model = pickle.load(file)\n", + " date_ordinal = pd.to_datetime(input_date).toordinal()\n", + " prediction = model.predict(np.array([[date_ordinal]]))\n", + " return prediction[0]" + ] + }, + { + "cell_type": "markdown", + "id": "5ddb75a3", + "metadata": {}, + "source": [ + "After defining this new version of our code we can execute the same steps as before to check if the error has been fixed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b0d71489-5604-437f-819c-328b8c1207ac", + "metadata": {}, + "outputs": [], + "source": [ + "# Step 1 - Generate synthetic data \n", + "df = generate_synthetic_data()\n", + "\n", + "# Step 2 - Split the dataset into training and test.\n", + "train, test = split_data(df)\n", + "\n", + "# Step 3 - Run the training function\n", + "model = train_model(train)" + ] + }, + { + "cell_type": "markdown", + "id": "47a92be4", + "metadata": {}, + "source": [ + "Great! No errors were found, we can proceed to finish our execution plan." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ecf46503-3b52-4b98-bb05-9037aab08d98", + "metadata": {}, + "outputs": [], + "source": [ + "# Test the evaluation function\n", + "rmse, mape = evaluate_model(model, test)\n", + "print(f'RMSE: {rmse}, MAPE: {mape}')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1afed715-977d-45bf-87ee-342712d19778", + "metadata": {}, + "outputs": [], + "source": [ + "# Test the model saving function\n", + "save_model(model)" + ] + }, + { + "cell_type": "markdown", + "id": "4d5f8ccd", + "metadata": {}, + "source": [ + "Everything seems to be working fine! The last step is to use the saved model file to run an invocation with a new value." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a2207287-8ea7-46fc-b8c1-07ce46937967", + "metadata": {}, + "outputs": [], + "source": [ + "# Define an example invocation\n", + "next_day = df['date'].iloc[-1] + pd.Timedelta(days=1) # Predict next day\n", + "sample_data = next_day.strftime('%Y-%m-%d') # Convert to string format\n", + "\n", + "# Test the invocation function\n", + "prediction = invoke_model('xgboost_model.pkl', sample_data)\n", + "print(f'Prediction for next day: {prediction}')" + ] + }, + { + "cell_type": "markdown", + "id": "34d52089", + "metadata": {}, + "source": [ + "### 3 - Code Explanation\n", + "Amazon Nova Premier can be a great ally when you need to understand code written by someone else. Let's use the generated code and ask the model for a detailed explanation of what it does." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "298129fd", + "metadata": {}, + "outputs": [], + "source": [ + "input_msg = [{\n", + " \"text\": f\"\"\"\n", + "Give me a detailed explanation of the code below:\n", + "{response}\n", + "\n", + "At the end, return a sequence diagram and a diagram of the execution steps used to run it. Use Mermaid to format the diagrams.\n", + "\"\"\"}]\n", + "\n", + "explanation_response = bedrock_run(input_msg)\n", + "print(explanation_response)" + ] + }, + { + "cell_type": "markdown", + "id": "75c4fb3b", + "metadata": {}, + "source": [ + "We can also check the generated diagrams for a better understanding of the code." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fe1bb034", + "metadata": {}, + "outputs": [], + "source": [ + "import base64\n", + "from IPython.display import display_svg\n", + "from urllib.request import Request, urlopen\n", + "\n", + "\n", + "# Helper function to render mermaid diagrams easily\n", + "def mm(graph):\n", + " graphbytes = graph.encode(\"ascii\")\n", + " base64_bytes = base64.b64encode(graphbytes)\n", + " base64_string = base64_bytes.decode(\"ascii\")\n", + " url=\"https://mermaid.ink/svg/\" + base64_string\n", + " req=Request(url, headers={'User-Agent': 'IPython/Notebook'})\n", + " display_svg(urlopen(req).read().decode(), raw=True)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "58a8b1f0", + "metadata": {}, + "outputs": [], + "source": [ + "sequence_diagram = \"\"\"\n", + "sequenceDiagram\n", + " participant Main\n", + " participant DataGenerator\n", + " participant DataSplitter\n", + " participant ModelTrainer\n", + " participant ModelEvaluator\n", + " participant ModelSaver\n", + " participant ModelInvoker\n", + "\n", + " Main->>DataGenerator: generate_synthetic_data()\n", + " DataGenerator-->>Main: data\n", + "\n", + " Main->>DataSplitter: split_data(data)\n", + " DataSplitter-->>Main: train, test\n", + "\n", + " 
Main->>ModelTrainer: train_model(train)\n", + " ModelTrainer-->>Main: model\n", + "\n", + " Main->>ModelEvaluator: evaluate_model(model, test)\n", + " ModelEvaluator-->>Main: rmse, mape\n", + "\n", + " Main->>ModelSaver: save_model(model)\n", + "\n", + " Main->>ModelInvoker: invoke_model('xgboost_model.pkl', '2020-12-31')\n", + " ModelInvoker-->>Main: prediction\n", + "\"\"\"\n", + " \n", + "mm(sequence_diagram)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8c10ddb6", + "metadata": {}, + "outputs": [], + "source": [ + "execution_diagram = \"\"\"\n", + "graph LR\n", + " A[Start] --> B[Generate Synthetic Data]\n", + " B --> C[Split Data into Train and Test]\n", + " C --> D[Train XGBoost Model]\n", + " D --> E[Evaluate Model]\n", + " E --> F[Save Model]\n", + " F --> G[Invoke Model for Prediction]\n", + " G --> H[End]\n", + "\"\"\"\n", + "\n", + "mm(execution_diagram)" + ] + }, + { + "cell_type": "markdown", + "id": "e25e6041-1f6e-43f5-9a9c-c3c71b1f9a55", + "metadata": {}, + "source": [ + "### 4 - Code Execution (Tool Use)\n", + "\n", + "Now, let's simplify our code and expose it as tools that the Nova Premier model can use.\n", + "\n", + "Let's start with the predefined function `split_data`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "99d3ec59-6c18-486e-8a97-b0c276064166", + "metadata": {}, + "outputs": [], + "source": [ + "tool_config = {\n", + " \"tools\": [\n", + " {\n", + " \"toolSpec\": {\n", + " \"name\": \"split_data\",\n", + " \"description\": \"Split dataset into train/test\",\n", + " \"inputSchema\": {\n", + " \"json\": {\n", + " \"type\": \"object\",\n", + " \"properties\": {\n", + " \"data\": {\n", + " \"type\": \"dataframe\",\n", + " \"description\": \"Dataset\"\n", + " }\n", + " },\n", + " \"required\": [\n", + " \"data\"\n", + " ]\n", + " }\n", + " }\n", + " }\n", + " }\n", + " ]\n", + "}\n", + "\n", + "input_msg = [{\"text\": \"Split my dataset using available tools\"}]" + ] + }, + { + 
"cell_type": "code", + "execution_count": null, + "id": "adcacffa-04bd-4b4e-9335-26059381100e", + "metadata": {}, + "outputs": [], + "source": [ + "tool_response = bedrock_run(input_msg=input_msg, tool_config=tool_config)\n", + "print(tool_response)" + ] + }, + { + "cell_type": "markdown", + "id": "3261c7bc", + "metadata": {}, + "source": [ + "### 5 - Code Refactoring" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2e340cd3", + "metadata": {}, + "outputs": [], + "source": [ + "input_msg = [{\n", + " \"text\": f\"\"\"\n", + "Convert the python code below to R:\n", + "{response}\n", + "\n", + "Only return the R code, without any additional text or explanation.\n", + "Do not include any comments in the R code.\n", + "\"\"\"}]\n", + "\n", + "refactored_response = bedrock_run(input_msg)\n", + "print(refactored_response)" + ] + }, + { + "cell_type": "markdown", + "id": "0c619e13", + "metadata": {}, + "source": [ + "# Wrap-Up\n", + "\n", + "In this notebook, we've explored how Amazon Nova Premier on Amazon Bedrock can significantly enhance the software development lifecycle through various code-related capabilities. Let's summarize what we've learned:\n", + "\n", + "## Key Capabilities Demonstrated\n", + "\n", + "1. **Code Generation**: We saw how Nova Premier can generate complete, functional Python code for time series forecasting based on natural language requirements. The model produced a comprehensive solution that included data generation, model training, evaluation, and serialization.\n", + "2. **Code Debugging**: When our generated code encountered an error with XGBoost, we leveraged Nova Premier to diagnose and fix the issue. The model correctly identified that we needed to convert date values to ordinal format for proper model training.\n", + "3. **Code Explanation**: Nova Premier provided a detailed explanation of the generated code, breaking down each function's purpose and how they work together. 
The model even created sequence and execution diagrams to visualize the code flow.\n", + "4. **Tool Use (Function Calling)**: We demonstrated how Nova Premier can be configured to use tools, enabling it to interact with predefined functions like our data splitting capability.\n", + "5. **Code Refactoring**: Finally, we showed how Nova Premier can transform code between languages, converting our Python implementation to R while maintaining the same functionality.\n", + "\n", + "## Benefits for Developers\n", + "\n", + "- **Accelerated Development**: Nova Premier can significantly reduce the time needed to write boilerplate code and implement common patterns.\n", + "- **Learning Aid**: The detailed explanations and visualizations help developers understand complex code structures.\n", + "- **Error Resolution**: Quick identification and resolution of bugs speeds up the debugging process.\n", + "- **Cross-language Support**: The ability to translate between programming languages facilitates collaboration across teams with different technical backgrounds.\n", + "\n", + "## Next Steps\n", + "\n", + "To continue exploring Amazon Nova Premier's capabilities for code development:\n", + "\n", + "1. Try using it for more complex software engineering tasks\n", + "2. Experiment with different programming languages and frameworks\n", + "3. Integrate it into your CI/CD pipeline for code reviews and suggestions\n", + "4. Use it to document existing codebases or legacy systems\n", + "\n", + "Amazon Nova Premier represents a powerful addition to a developer's toolkit, helping to streamline development workflows, improve code quality, and accelerate project delivery while maintaining best practices in software engineering." 
+ ] + }, + { + "cell_type": "markdown", + "id": "f319a30b", + "metadata": {}, + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/multimodal-understanding/code-patterns/img/nova_1.png b/multimodal-understanding/code-patterns/img/nova_1.png new file mode 100644 index 00000000..6b26a220 Binary files /dev/null and b/multimodal-understanding/code-patterns/img/nova_1.png differ diff --git a/multimodal-understanding/code-patterns/img/nova_2.png b/multimodal-understanding/code-patterns/img/nova_2.png new file mode 100644 index 00000000..a455160a Binary files /dev/null and b/multimodal-understanding/code-patterns/img/nova_2.png differ diff --git a/multimodal-understanding/code-patterns/img/nova_3.png b/multimodal-understanding/code-patterns/img/nova_3.png new file mode 100644 index 00000000..5ed526f3 Binary files /dev/null and b/multimodal-understanding/code-patterns/img/nova_3.png differ