# Lab 3.5: Evaluating a Prompt Template for an External Model

In this section of the lab, you will configure several **OpenScale** monitors to evaluate the prompt template for an external model and obtain generative AI quality and model health metrics.

### Prerequisites
* Access credentials for **OpenScale**
* Project ID and Asset ID

### Contents
1. Preparation
    - Libraries
    - Credentials
2. OpenScale Configuration
    - Connecting to OpenScale
    - Connecting to watsonx.ai
    - Creating the project mapping
    - Configuring the _Prompt Template_ subscription
3. Evaluation in OpenScale
    - Evaluation
    - Viewing the evaluation (overview)
    - Viewing Generative AI Quality (overview)
    - Viewing Generative AI Quality (per record)
4. Conclusion

## 1. Preparation

### Libraries

In [None]:
! pip install --upgrade ibm-watson-openscale ibm-watsonx-ai --no-cache-dir

In [1]:
from ibm_cloud_sdk_core.authenticators import CloudPakForDataAuthenticator
from ibm_watson_openscale import APIClient
from ibm_watson_openscale.base_classes import ApiRequestFailure

from IPython.display import display, Markdown
from rich import print
import pandas as pd

import os

### Credentials

In [4]:
CPD_URL = "<EDIT>"
CPD_USERNAME = "<EDIT>"
CPD_APIKEY = "<EDIT>"

PROJECT_ID = os.getenv("PROJECT_ID", "<EDIT>")

ASSET_ID = "<EDIT>"
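
Before connecting, it can help to confirm that no setting still holds the `<EDIT>` placeholder. The following is a minimal sketch; `find_unset` is a hypothetical helper (not part of any IBM SDK) and the example values are made up.

```python
# Hypothetical helper: report which settings are empty or still hold
# the "<EDIT>" placeholder used in this notebook.
PLACEHOLDER = "<EDIT>"

def find_unset(settings: dict) -> list:
    """Return the names of settings that are empty or equal to the placeholder."""
    return [name for name, value in settings.items()
            if not value or value == PLACEHOLDER]

# Dummy values for illustration; in the notebook you would pass
# CPD_URL, CPD_USERNAME, CPD_APIKEY, PROJECT_ID and ASSET_ID.
example = {
    "CPD_URL": "https://cpd.example.com",
    "CPD_USERNAME": "<EDIT>",
    "CPD_APIKEY": "",
}
missing = find_unset(example)
print(missing)
```

If the returned list is not empty, fill in those values before running the connection cells.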

## 2. OpenScale Configuration

### Connecting to OpenScale

In [4]:
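# disable_ssl_verification=True skips TLS certificate checks. This is convenient
# for a lab cluster with self-signed certificates, but not advisable in production.
authenticator = CloudPakForDataAuthenticator(
    url = CPD_URL,
    username = CPD_USERNAME,
    apikey = CPD_APIKEY,
    disable_ssl_verification = True
)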
wos_client = APIClient(
    service_url = CPD_URL,
    authenticator = authenticator,
    service_instance_id = None
)

### Connecting to watsonx.ai

In [8]:
from ibm_watsonx_ai import APIClient as WatsonxAIClient, Credentials
from ibm_watsonx_ai.foundation_models.prompts import PromptTemplateManager

creds = Credentials(
    api_key=CPD_APIKEY,
    url=CPD_URL,
    username=CPD_USERNAME,
    instance_id="openshift"
)

watsonx_ai_client = WatsonxAIClient(credentials=creds, project_id=PROJECT_ID)

### Creating the project mapping

In [5]:
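# Map the project to the OpenScale instance. A 409 (Conflict) response is
# ignored so the cell can be re-run; any other failure is re-raised.
try:
    wos_client.wos.add_instance_mapping(
        project_id = PROJECT_ID,
        service_instance_id = wos_client.service_instance_id
    )
except ApiRequestFailure as arf:
    if arf.response.status_code != 409:
        raise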

### Configuring the _Prompt Template_ subscription

In [None]:
prompt_mgr = PromptTemplateManager(api_client=watsonx_ai_client, project_id=PROJECT_ID)

# List the prompt templates in the project and take the first row of the result.
# To evaluate a specific template, assign its ID here instead.
templates_df = prompt_mgr.list(limit=5)
prompt_template_id = templates_df.iloc[0]['ID']
prompt_template_id

In [None]:
monitors = {
    "generative_ai_quality": {
        "parameters": {
            "min_sample_size": 10,
            "metrics_configuration": {                    
            }
        }
    }
}

wos_client.wos.execute_prompt_setup(
    prompt_template_asset_id = prompt_template_id,
    project_id = PROJECT_ID,
    supporting_monitors = monitors,
    problem_type = "generation",
    label_column = "reference",
    input_data_type = "unstructured_text",
    operational_space_id = "development"
).result.to_dict()

In [None]:
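import time

# Poll until the prompt setup finishes. Pause between requests instead of
# busy-looping, and give up after 10 minutes rather than waiting forever.
deadline = time.monotonic() + 600
while True:
    response = wos_client.wos.get_prompt_setup(
        prompt_template_asset_id = prompt_template_id,
        project_id = PROJECT_ID
    ).result
    if response.status.state == "FINISHED":
        print("Finished prompt setup. The response is {}".format(response))
        break
    if time.monotonic() > deadline:
        raise TimeoutError("Prompt setup did not finish; last state: {}".format(response.status.state))
    time.sleep(5)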
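
The polling pattern in this cell can also be factored into a small reusable helper. The sketch below is illustrative only: `wait_for_state` is a hypothetical function, and the `"RUNNING"` and `"ERROR"` state names in the usage example are assumptions, not values confirmed by the OpenScale documentation.

```python
import time

def wait_for_state(get_state, done="FINISHED", failed=(),
                   interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll get_state() until it returns `done`; raise on a failed state or on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state == done:
            return state
        if state in failed:
            raise RuntimeError(f"Setup ended in state {state!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still in state {state!r} after {timeout} s")
        sleep(interval)

# Usage with a fake state source; the state names are assumed for illustration.
states = iter(["RUNNING", "RUNNING", "FINISHED"])
result = wait_for_state(lambda: next(states), failed=("ERROR",), interval=0, timeout=5)
print(result)
```

In the notebook, `get_state` would be a function that calls `wos_client.wos.get_prompt_setup(...)` and returns `result.status.state`.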

In [None]:
subscription_id = response.subscription_id
mrm_monitor_instance_id = response.mrm_monitor_instance_id
wos_client.monitor_instances.show(target_target_id = subscription_id)

## 3. Evaluation in OpenScale

### Evaluation

In [None]:
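# Download the evaluation data and save a local copy for upload.
# index=False keeps pandas from writing its row index as an extra unnamed column.
df = pd.read_csv("https://raw.githubusercontent.com/maialenespi/TEL-content/main/evaluation-tickets.csv")
df.to_csv("evaluation-tickets.csv", index=False)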

response = wos_client.monitor_instances.mrm.evaluate_risk(
    monitor_instance_id = mrm_monitor_instance_id,
    test_data_set_name = "data", 
    test_data_path = "evaluation-tickets.csv",
    content_type = "multipart/form-data",
    body = {},
    project_id = PROJECT_ID,
    includes_model_output = True,
    background_mode = False
)
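
Because the subscription was set up with `label_column = "reference"` and the evaluation is called with `includes_model_output = True`, the uploaded CSV needs both a reference column and a column holding the model output. A minimal sketch of a header check follows; `missing_columns` is a hypothetical helper, and the `ticket` and `generated_text` column names are assumptions about the file, so adjust them to the actual header.

```python
import csv
import io

def missing_columns(csv_text: str, required: list) -> list:
    """Return the required column names that are absent from the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return [col for col in required if col not in header]

# Made-up sample; "generated_text" is an assumed name for the model output column.
sample = "ticket,reference,generated_text\nMy router is down,network,network\n"
ok = missing_columns(sample, ["reference", "generated_text"])
bad = missing_columns(sample, ["reference", "prediction"])
print(ok, bad)
```

In the notebook the same check could be applied to `df.columns` before calling `evaluate_risk`.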

In [None]:
wos_client.monitor_instances.mrm.get_risk_evaluation(mrm_monitor_instance_id, project_id = PROJECT_ID).result.to_dict()

### Viewing the evaluation (overview)

In [None]:
wos_client.monitor_instances.show_metrics(
    monitor_instance_id = mrm_monitor_instance_id
)

### Viewing Generative AI Quality (overview)

In [None]:
monitor_definition_id = "generative_ai_quality"

genaiq_monitor_id = wos_client.monitor_instances.list(
    data_mart_id = wos_client.service_instance_id,
    monitor_definition_id = monitor_definition_id,
    target_target_id = subscription_id,
    project_id = PROJECT_ID
).result.monitor_instances[0].metadata.id

wos_client.monitor_instances.show_metrics(
    monitor_instance_id = genaiq_monitor_id
)

### Viewing Generative AI Quality (per record)

In [None]:
genaiq_dataset_id = wos_client.data_sets.list(
    target_target_id = subscription_id,
    target_target_type = "subscription",
    type = "gen_ai_quality_metrics"
).result.data_sets[0].metadata.id

wos_client.data_sets.show_records(
    data_set_id = genaiq_dataset_id
)

## 4. Conclusion

In [None]:
factsheets_url = f"{CPD_URL}/wx/prompt-details/{ASSET_ID}/factsheet?context=wx&project_id={PROJECT_ID}"
display(Markdown(f"[Click here to view the published AI Factsheet]({factsheets_url})"))
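
The f-string above works when `CPD_URL` has no trailing slash and the IDs contain only URL-safe characters. A slightly more defensive sketch builds the same link with `urllib.parse`; `factsheet_url` is a hypothetical helper and the example values are made up.

```python
from urllib.parse import quote, urlencode

def factsheet_url(base_url: str, asset_id: str, project_id: str) -> str:
    """Build the factsheet link, tolerating a trailing slash and encoding the IDs."""
    base = base_url.rstrip("/")
    query = urlencode({"context": "wx", "project_id": project_id})
    return f"{base}/wx/prompt-details/{quote(asset_id, safe='')}/factsheet?{query}"

url = factsheet_url("https://cpd.example.com/", "abc-123", "proj-456")
print(url)
```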