Domain 2: Using OCI Generative AI Service (40%)
Domain 2 of the 1Z0-1127-25 Oracle Cloud Infrastructure 2025 Generative AI Professional exam covers the OCI Generative AI managed service end to end: pretrained models, dedicated AI clusters, fine-tuning, endpoints, inference APIs, security, and the Playground. At 40% of the exam (approximately 20 out of 50 questions), this is by far the heaviest domain. The exam syllabus identifies these topic areas:
- Chat and embedding foundational models
- Dedicated AI clusters (fine-tuning and hosting)
- Fine-tuning base models with custom datasets
- Model endpoints and deployment
- Inference API parameters
- Security architecture and IAM policies
- OCI GenAI Playground
The exam format is 50 multiple-choice questions in 90 minutes with a passing score of 68%. Questions are scenario-based -- expect items that test specific model IDs, parameter ranges, cluster unit types, and IAM resource names.
1. OCI Generative AI Service Fundamentals
OCI Generative AI is a fully managed Oracle Cloud service providing state-of-the-art large language models for chat, text generation, text embedding, and reranking. The service is accessed through the OCI Console (Playground), REST APIs, OCI CLI, and SDKs (Python, Java). (Overview)
Console path: Navigation Menu > Analytics & AI > AI Services > Generative AI
Two Operating Modes
| Mode | Description | Use Case |
|---|---|---|
| On-Demand | Pay-per-inference; shared infrastructure; no cluster setup | Experimentation, PoC, model evaluation |
| Dedicated AI Cluster | Single-tenant GPU resources; customer-exclusive | Production workloads, fine-tuning, custom model hosting |
On-demand mode caps response length at 4,000 tokens per run. Dedicated mode is uncapped up to the model's full context window. (Concepts)
Exam trap: On-demand text generation and summarization APIs are retired. Only chat and embedding are available on-demand. Generation/summarization models (e.g., cohere.command) can still run on dedicated clusters but not on-demand.
2. Pretrained Foundational Models
The service offers models from multiple providers. For exam purposes, the core models to know are the Cohere and Meta families. (Pretrained Models)
2.1 Chat Models
Cohere Command Family
| Model | Model ID | Context Window | Key Capabilities |
|---|---|---|---|
| Command A (03-2025) | cohere.command-a-03-2025 | 256K tokens | Most performant Cohere chat; agentic enterprise tasks |
| Command R+ (08-2024) | cohere.command-r-plus-08-2024 | 128K tokens | Complex tasks, Q&A, sentiment analysis, multilingual RAG |
| Command R (08-2024) | cohere.command-r-08-2024 | 128K tokens | Same capabilities as R+; more cost-efficient; supports fine-tuning |
| Command R (16K) | cohere.command-r-16k | 16K tokens | Retired -- general language tasks |
| Command R+ | cohere.command-r-plus | 128K tokens | Retired |
Exam trap: The older cohere.command-r-16k model has a 16K context window, not 128K. Do not confuse it with cohere.command-r-08-2024 which has 128K.
Meta Llama Family
| Model | Model ID | Parameters | Context Window | Key Capabilities |
|---|---|---|---|---|
| Llama 3.3 (70B) | meta.llama-3.3-70b-instruct | 70B | 128K | Best 70B performance; on-demand and dedicated; supports fine-tuning |
| Llama 3.2 (90B Vision) | meta.llama-3.2-90b-vision-instruct | 90B | 128K | Multimodal (text + image) |
| Llama 3.2 (11B Vision) | meta.llama-3.2-11b-vision-instruct | 11B | 128K | Compact multimodal; dedicated only |
| Llama 3.1 (405B) | meta.llama-3.1-405b-instruct | 405B | 128K | Largest; advanced reasoning, coding, math, tool use |
| Llama 3.1 (70B) | meta.llama-3.1-70b-instruct | 70B | 128K | Retired -- predecessor to 3.3 |
Exam trap: Llama 3.1 405B on-demand is only available in US Midwest (Chicago). All other regions require a dedicated cluster. The required cluster unit type is Large Generic 2 (not Large Generic 4, which was the older type).
Additional Providers (Newer Additions)
The service also offers Google Gemini (via Oracle Interconnect for Google Cloud, on-demand only), OpenAI gpt-oss models, and xAI Grok models. These are noted here for completeness but are less likely to be heavily tested since the exam syllabus was written around the Cohere/Meta core.
2.2 Embedding Models
All embedding models in OCI GenAI are Cohere Embed models. (Embed Models)
| Model | Model ID | Dimensions | Input | Notes |
|---|---|---|---|---|
| Embed 4 | cohere.embed-v4.0 | 256, 512, 1024, 1536 (configurable) | Text + Image | Latest; multimodal; configurable output dimensions |
| Embed English 3 | cohere.embed-english-v3.0 | 1024 | Text only | English; 512 tokens/input; max 96 inputs/run |
| Embed English Light 3 | cohere.embed-english-light-v3.0 | 384 | Text only | Lightweight variant |
| Embed Multilingual 3 | cohere.embed-multilingual-v3.0 | 1024 | Text only | 100+ languages |
| Embed Multilingual Light 3 | cohere.embed-multilingual-light-v3.0 | 384 | Text only | Lightweight multilingual |
| Embed English Image 3 | cohere.embed-english-image-v3.0 | 1024 | Text + Image | Multimodal English |
| Embed Multilingual Image 3 | cohere.embed-multilingual-image-v3.0 | 1024 | Text + Image | Multimodal multilingual |
Key facts for the exam:
- Standard models output 1024 dimensions; Light models output 384 dimensions
- Maximum 96 inputs per run for text-only models
- Maximum 512 tokens per input for text-only Embed v3 models
- Text + image models support up to 128,000 tokens total across all inputs
- A 512x512 image consumes approximately 1,610 tokens
- Image input is API only -- not available in the Console Playground
- Embedding models cannot be fine-tuned
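The 96-inputs-per-run limit matters when embedding a large document set: you must batch your texts before calling the API. A minimal helper (pure illustration, not part of the OCI SDK) that respects the limit:

```python
# Split a document list into batches that respect the 96-inputs-per-run
# limit of the text-only Embed models. Illustrative only.

def batch_inputs(texts, max_per_run=96):
    """Yield successive batches of at most max_per_run texts."""
    for start in range(0, len(texts), max_per_run):
        yield texts[start:start + max_per_run]

docs = [f"chunk {i}" for i in range(200)]
batches = list(batch_inputs(docs))
print([len(b) for b in batches])  # [96, 96, 8]
```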
2.3 Rerank Model
| Model | Model ID | Function |
|---|---|---|
| Cohere Rerank 3.5 | cohere.rerank.v3-5 | Takes a query + list of texts, returns ranked array with relevance scores |
Reranking is used in RAG pipelines to re-order retrieved documents by relevance before passing them to the LLM.
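The re-ordering step can be sketched as follows. This is not the OCI SDK; the response shape (index plus relevance_score pairs) is an assumption modeled on typical rerank APIs, shown only to make the pipeline step concrete:

```python
# Illustrative sketch: re-order retrieved documents using relevance
# scores like those returned by a rerank model. Response field names
# ("index", "relevance_score") are assumptions for illustration.

def rerank_order(documents, results):
    """Sort documents by descending relevance score.

    results: one {"index": int, "relevance_score": float} entry
    per input document.
    """
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [documents[r["index"]] for r in ranked]

docs = ["OCI pricing overview", "Dedicated AI cluster sizing", "Login help"]
scores = [
    {"index": 0, "relevance_score": 0.12},
    {"index": 1, "relevance_score": 0.93},
    {"index": 2, "relevance_score": 0.04},
]
print(rerank_order(docs, scores))
# ['Dedicated AI cluster sizing', 'OCI pricing overview', 'Login help']
```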
3. Dedicated AI Clusters
Dedicated AI clusters are single-tenant GPU compute resources for fine-tuning custom models or hosting endpoints. They are not shared with other tenancies. (Managing Dedicated AI Clusters)
3.1 Cluster Types
| Type | Purpose | GPU Requirement |
|---|---|---|
| Fine-tuning | Train custom models from base models | Higher GPU count than hosting |
| Hosting | Serve endpoints for pretrained, custom, or imported models | Lower GPU count |
Exam trap: Fine-tuning clusters require significantly more GPU resources than hosting clusters. You cannot use a hosting cluster for fine-tuning.
3.2 GPU Unit Shapes
Unit shape names follow the format: <Instance Type>_<Number of Cards>. Examples: H100_X1 = H100 with 1 card. For A100 shapes, the memory size distinguishes variants: A100-80G vs A100-40G. The unit shape cannot be changed after cluster creation. (Creating Hosting Clusters)
3.3 Cluster Unit Types by Model
Each model requires a specific cluster unit type. These are critical for the exam:
| Model | Hosting Unit Type | Units for Hosting | Fine-Tuning Units | Fine-Tuning Method |
|---|---|---|---|---|
| cohere.command-r-08-2024 | Small Cohere V2 | 1 | 8 | T-Few or LoRA |
| cohere.command-r-plus-08-2024 | Large Cohere V2_2 | 1 | N/A | Not supported |
| meta.llama-3.3-70b-instruct | Large Generic | 1 | LoRA units | LoRA |
| meta.llama-3.1-405b-instruct | Large Generic 2 | 1 (x4 multiplier) | N/A | Not supported |
| cohere.embed-english-v3.0 | Embed Cohere | 1 | N/A | Not supported |
Key facts:
- Maximum 50 endpoints per cluster (increase requestable)
- Multiple endpoints on the same cluster must use the same base model -- you cannot mix base models and custom models on one cluster
- Model replicas: Increase throughput by adding units (each replica = 1 additional unit)
- Cluster creation requires accepting commitment unit hours terms
3.4 Capacity and Scaling
- Default: 1 unit created per cluster
- Scale up by editing the cluster to add model replicas
- Each replica increases throughput proportionally
- Service limits control maximum units per shape (e.g., dedicated-unit-small-cohere-count, dedicated-unit-llama2-70-count)
Exam trap: To increase the Llama 3.1 405B hosting limit, you must request an increase of 4 units (not 1) because the multiplier is x4.
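The x4 multiplier arithmetic is worth internalizing: each additional 405B replica consumes 4 Large Generic 2 units, so limit increases must be requested in multiples of 4. A quick sketch of the math (the constant reflects the multiplier stated above; this is not an official sizing tool):

```python
# Units needed to host N replicas of meta.llama-3.1-405b-instruct,
# given the x4 unit multiplier described in the service docs.

UNITS_PER_REPLICA_405B = 4  # x4 multiplier

def units_required(replicas, units_per_replica=UNITS_PER_REPLICA_405B):
    return replicas * units_per_replica

print(units_required(1))  # 4
print(units_required(3))  # 12
```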
4. Fine-Tuning Base Models
Fine-tuning creates a custom model by training a copy of a pretrained base model on your own dataset. (Fine-Tune Models)
4.1 Fine-Tuning Methods
OCI GenAI supports two fine-tuning methods. The system automatically selects the method based on the chosen base model -- you do not manually choose. (Selecting a Fine-Tuning Method)
| Method | Supported Models | Description |
|---|---|---|
| T-Few | cohere.command-r-08-2024 | Adds learned vectors to transformer attention; trains only a few additional parameters. Oracle's efficient approach for Cohere models. |
| LoRA (Low-Rank Adaptation) | cohere.command-r-08-2024, meta.llama-3.3-70b-instruct, meta.llama-3.1-70b-instruct | Adds low-rank update matrices to attention layers; widely used parameter-efficient method. |
Exam trap: cohere.command-r-08-2024 supports both T-Few and LoRA. The Llama models support only LoRA. cohere.command-r-plus-08-2024 does not support fine-tuning at all.
4.2 Training Dataset Requirements
| Requirement | Specification |
|---|---|
| File format | JSONL (JSON Lines) |
| Encoding | UTF-8 |
| Line format | {"prompt": "<prompt>", "completion": "<response>"} |
| Minimum samples | 32 prompt/completion pairs |
| Maximum datasets per model | 1 |
| Data split | Automatic: 80% training / 20% validation |
| Storage | OCI Object Storage bucket |
| Fine-tuning token limits (Command R) | Prompt up to 16,000 tokens; completion up to 4,000 tokens |
Exam trap: The dataset must have exactly two fields: prompt and completion. Any other field structure will fail. The minimum is 32 pairs -- not 10, not 100.
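A hedged sketch of validating a dataset against the stated requirements -- JSONL lines with exactly the fields prompt and completion, and at least 32 pairs. The validation logic is illustrative, not an official OCI tool:

```python
# Check a fine-tuning dataset: JSONL, exactly {"prompt", "completion"}
# per line, minimum 32 pairs. Illustrative validator only.
import json

def validate_dataset(jsonl_text, min_samples=32):
    lines = [ln for ln in jsonl_text.splitlines() if ln.strip()]
    if len(lines) < min_samples:
        return False, f"need at least {min_samples} pairs, got {len(lines)}"
    for i, line in enumerate(lines, 1):
        record = json.loads(line)
        if set(record) != {"prompt", "completion"}:
            return False, f"line {i}: fields must be exactly prompt/completion"
    return True, "ok"

# Build a minimal valid dataset of 32 pairs.
sample = "\n".join(
    json.dumps({"prompt": f"Question {i}?", "completion": f"Answer {i}."})
    for i in range(32)
)
print(validate_dataset(sample))  # (True, 'ok')
```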
4.3 Fine-Tuning Workflow
- Create the training dataset in JSONL format
- Upload the dataset to an OCI Object Storage bucket
- Create a fine-tuning dedicated AI cluster (select the base model)
- Create a new custom model (or new version of existing model)
- Create a hosting dedicated AI cluster
- Create an endpoint for the custom model on the hosting cluster
- Test in the Playground or call via API
4.4 Hyperparameters
LoRA Hyperparameters (Meta Llama Models)
| Parameter | Range | Default | Description |
|---|---|---|---|
| Total training epochs | 1+ (integer) | 3 | Iterations through entire dataset |
| Learning rate | 0 to 1.0 | 0.0002 | Speed of weight updates |
| Training batch size | 8 to 16 | 8 | Samples per mini-batch |
| Early stopping patience | 0 or 1+ | 15 | Grace periods after threshold; 0 disables |
| Early stopping threshold | 0 or positive | 0.0001 | Minimum loss improvement |
| LoRA r (rank) | 1 to 64 | 8 | Attention dimension of update matrices |
| LoRA alpha | 1 to 128 | 8 | Scaling parameter (weight = alpha / r) |
| LoRA dropout | 0 to < 1 | 0.1 | Dropout probability for LoRA layers |
| Log interval | Fixed | 10 steps | Not tunable |
T-Few Hyperparameters (Cohere Models)
| Parameter | Range | Default | Description |
|---|---|---|---|
| Total training epochs | 1 to 10 | 1 | Iterations through entire dataset |
| Learning rate | 0.000005 to 0.1 | 0.01 | Speed of weight updates |
| Training batch size | 8 to 32 | 16 | Samples per mini-batch |
| Early stopping patience | 0 or 1 to 16 | 10 | Grace periods after threshold; 0 disables |
| Early stopping threshold | 0.001 to 0.1 | 0.001 | Minimum loss improvement |
| Log interval | Fixed | 1 step | Not tunable |
Total training steps formula:
totalTrainingSteps = (totalTrainingEpochs * datasetSize) / trainingBatchSize
Exam trap: T-Few defaults to 1 epoch with learning rate 0.01. LoRA defaults to 3 epochs with learning rate 0.0002. These are very different -- know which is which. Also note T-Few batch size range is 8-32 (default 16) while LoRA is 8-16 (default 8).
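The total-training-steps formula above as executable arithmetic (integer division here; the docs state the plain formula, and the example numbers are illustrative):

```python
# totalTrainingSteps = (totalTrainingEpochs * datasetSize) / trainingBatchSize

def total_training_steps(total_training_epochs, dataset_size, training_batch_size):
    return (total_training_epochs * dataset_size) // training_batch_size

# LoRA defaults: 3 epochs, batch size 8; assume a 400-sample dataset.
print(total_training_steps(3, 400, 8))  # 150
# T-Few defaults: 1 epoch, batch size 16; assume 320 samples.
print(total_training_steps(1, 320, 16))  # 20
```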
4.5 Fine-Tuning vs. Prompt Engineering
| Criteria | Prompt Engineering | Fine-Tuning |
|---|---|---|
| Cost | Low (no training) | High (dedicated cluster + training time) |
| Data required | None (examples in prompt) | Minimum 32 labeled samples |
| Setup time | Immediate | Hours to train |
| Best for | General tasks, format control | Domain-specific knowledge, specialized outputs |
| Model change | None | Creates new custom model |
| Maintenance | Update prompts as needed | Retrain when data changes |
5. Creating Model Endpoints
An endpoint makes a model available for inference. Every model (pretrained, custom, or imported) requires an endpoint on a dedicated cluster for dedicated mode. On-demand models do not require explicit endpoint creation. (Creating Endpoints)
5.1 Endpoint Types
| Type | Availability | Description |
|---|---|---|
| Public endpoint | All model types | Default; accessible over the internet |
| Private endpoint | Pretrained and custom models only | Runs inside a VCN private subnet; requires pre-created private endpoint resource |
Exam trap: Imported models support public endpoints only -- private endpoints are not available for imported models.
5.2 Endpoint Configuration
Key settings during endpoint creation:
- Compartment: Where the endpoint lives (recommended: same as model)
- Model selection: Choose pretrained, custom, or imported model
- Dedicated AI cluster: Select existing active cluster or create new one
- Networking: Public (default) or Private endpoint
- Guardrails: Content moderation, prompt injection defense, PII handling (pretrained and custom only)
- Name: Auto-generated as generativeaiendpoint<timestamp> if not specified
5.3 Guardrails
Guardrails are optional safety controls configurable at the endpoint level for pretrained and custom models. They are not available for imported models. (Guardrails)
| Guardrail | Function | Scoring |
|---|---|---|
| Content Moderation (CM) | Detects hate, harassment, sexual content, violence, self-harm | Binary: 0.0 (safe) / 1.0 (unsafe) + BLOCKLIST check |
| Prompt Injection (PI) | Detects "ignore previous instructions", system prompt exfiltration, hidden instructions | Binary: 0.0 / 1.0 |
| PII Detection | Identifies names, emails, phone numbers, IDs, financial data | Confidence score 0.0-1.0 per detected entity |
Key facts:
- Guardrails are disabled by default -- must be explicitly enabled
- Enabled via the ApplyGuardrails API or during endpoint creation in Console
- PII detection returns specific fields: text, label, offset, length, score
- Content Moderation tested on RTPLX dataset (38+ languages)
6. Inference API
The OCI Generative AI Inference API provides three primary operations: Chat, EmbedText, and ApplyGuardrails.
6.1 Chat API Parameters
These parameters control model output behavior. Know the ranges and defaults for each. (Cohere Command R+ (08-2024), Meta Llama 3.1 (405B))
| Parameter | Description | Cohere Default | Llama Default |
|---|---|---|---|
| Temperature | Controls randomness; 0 = deterministic, higher = more creative | Start at 0 or < 1 | Start at 0 or < 1 |
| Top P | Nucleus sampling; cumulative probability threshold (0-1) | Model-specific | Model-specific |
| Top K | Samples from top K most likely tokens | 0 (disabled; consider all) | -1 (consider all) |
| Frequency Penalty | Penalizes frequently appearing tokens; reduces repetition | 0 | 0 |
| Presence Penalty | Penalizes tokens already used; encourages diversity | 0 | 0 |
| Max Output Tokens | Maximum tokens generated per response | Up to 4,000 (on-demand) / 128K (dedicated) | Up to 4,000 (on-demand) / 128K (dedicated) |
| Seed | Deterministic output; Console max 9,999; API unlimited | null | null |
| Stop Sequences | Token sequences that halt generation | None | None |
Exam trap: Top K default for Cohere is 0 (disabled). Top K default for Llama is -1 (consider all tokens). Both effectively consider all tokens, but the numeric defaults differ.
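The provider-specific defaults can be captured in a small parameter builder. This is an illustrative dictionary, not the OCI SDK request model; the key names follow common SDK conventions and are assumptions:

```python
# Illustrative chat-inference parameter set using the ranges/defaults
# from the table above. Field names are assumptions, not the exact
# OCI request model.

def chat_params(provider):
    params = {
        "temperature": 0.3,       # lower = more deterministic
        "top_p": 0.75,            # nucleus sampling threshold
        "frequency_penalty": 0.0,
        "presence_penalty": 0.0,
        "max_tokens": 600,        # stays under the 4,000-token on-demand cap
    }
    # Default top_k differs by provider: 0 (Cohere) vs -1 (Llama);
    # both effectively mean "consider all tokens".
    params["top_k"] = 0 if provider == "cohere" else -1
    return params

print(chat_params("cohere")["top_k"])  # 0
print(chat_params("meta")["top_k"])    # -1
```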
6.2 Cohere-Specific Parameters
| Parameter | Values | Description |
|---|---|---|
| Preamble Override | Free text | System prompt; defaults to "You are Command..." |
| Safety Mode | CONTEXTUAL (default), STRICT, OFF | Controls content safety filtering |
- Contextual: Fewer constraints; allows profanity and explicit content (entertainment/academic)
- Strict: Avoids sensitive topics (corporate/customer-facing)
- Off: No safety filtering
6.3 Embedding API (EmbedText)
Required parameters:
- inputs (List[str]): Texts to embed (max 512 tokens each for v3 text-only)
- compartment_id: Target compartment OCID
- serving_mode: On-demand or dedicated
Optional parameters:
| Parameter | Values | Description |
|---|---|---|
| input_type | SEARCH_DOCUMENT, SEARCH_QUERY, CLASSIFICATION, CLUSTERING, IMAGE | Optimizes embedding for intended use |
| truncate | NONE (default), START, END | Behavior when input exceeds token limit |
| embedding_types | float, int8, uint8, binary, ubinary, base64 | Output format |
| output_dimensions | 256, 512, 1024, 1536 | Only for Embed v4+ models |
| is_echo | Boolean | Include original inputs in response |
Exam trap: The input_type parameter is critical for RAG. Use SEARCH_DOCUMENT when embedding documents for storage. Use SEARCH_QUERY when embedding the user's search query. Using the wrong type degrades retrieval quality.
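The asymmetry can be sketched as follows: documents get SEARCH_DOCUMENT at index time, the user's question gets SEARCH_QUERY at query time. The payload field names here are assumptions modeled loosely on the EmbedText request shape, for illustration only:

```python
# Illustrative embedding request payloads for the two sides of a RAG
# pipeline. Field names are assumptions, not the exact OCI API model.

VALID_INPUT_TYPES = {"SEARCH_DOCUMENT", "SEARCH_QUERY",
                     "CLASSIFICATION", "CLUSTERING", "IMAGE"}

def embed_request(texts, input_type):
    if input_type not in VALID_INPUT_TYPES:
        raise ValueError(f"unknown input_type: {input_type}")
    return {"inputs": texts, "input_type": input_type, "truncate": "END"}

# Index time: embed documents for storage.
index_req = embed_request(["OCI GenAI offers Cohere Embed models."],
                          "SEARCH_DOCUMENT")
# Query time: embed the user's question.
query_req = embed_request(["Which embed models does OCI offer?"],
                          "SEARCH_QUERY")
print(index_req["input_type"], query_req["input_type"])
```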
6.4 Token Estimation
Approximately 4 characters per token. This is a rough estimate used consistently across OCI GenAI documentation.
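The rule of thumb as a quick estimator (real tokenizers vary by model, so treat the result as a ballpark figure only):

```python
# Rough token estimate using the ~4 characters-per-token heuristic.

def estimate_tokens(text, chars_per_token=4):
    """Ballpark token count; actual tokenization is model-specific."""
    return max(1, len(text) // chars_per_token)

print(estimate_tokens("Generative AI on OCI"))  # 20 chars -> 5 tokens
```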
6.5 On-Demand vs. Dedicated Response Limits
| Mode | Max Response Tokens | Context Window |
|---|---|---|
| On-demand | 4,000 tokens | Model's full context (e.g., 128K) |
| Dedicated | Uncapped (up to context window) | Model's full context |
7. Security Architecture
7.1 IAM Policies
OCI GenAI uses standard OCI IAM for access control. Only the Administrators group has access by default; all other users require explicit policies. (IAM Policies)
Aggregate resource type: generative-ai-family (covers all 11 individual resource types)
Individual Resource Types
| Resource Type | Controls |
|---|---|
| generative-ai-chat | Chat inference |
| generative-ai-text-generation | Text generation inference |
| generative-ai-text-summarization | Summarization inference |
| generative-ai-text-embedding | Embedding inference |
| generative-ai-model | Custom models |
| generative-ai-imported-model | Imported models |
| generative-ai-dedicated-ai-cluster | Dedicated AI clusters |
| generative-ai-endpoint | Model endpoints |
| generative-ai-private-endpoint | Private endpoints |
| generative-ai-apikey | API keys |
| generative-ai-work-request | Work requests |
Permission Verbs (Cumulative)
| Verb | Includes | Operations |
|---|---|---|
| inspect | -- | List resources |
| read | inspect | View details |
| use | read | Update, invoke inference |
| manage | use | Create, delete, move |
Common policy examples:
-- Full access at tenancy level
allow group GenAI-Admins to manage generative-ai-family in tenancy
-- Compartment-scoped inference only
allow group GenAI-Users to use generative-ai-chat in compartment AI-Prod
-- Embedding inference only
allow group GenAI-Users to use generative-ai-text-embedding in compartment AI-Prod
-- Manage clusters and endpoints
allow group GenAI-Ops to manage generative-ai-dedicated-ai-cluster in compartment AI-Prod
allow group GenAI-Ops to manage generative-ai-endpoint in compartment AI-Prod
Exam trap: To use the chat API, the verb is use on resource generative-ai-chat, not manage. The use verb is sufficient for inference. manage is only needed for creating/deleting resources.
Fine-Tuning Data Access
Training datasets in Object Storage require separate policies:
-- Upload datasets
allow group GenAI-Admins to manage object-family in compartment Data-Bucket
-- Read datasets during model creation
allow group GenAI-Admins to use object-family in compartment Data-Bucket
If training data and custom models are in different compartments, the user creating the model needs use object-family in the compartment containing the bucket.
7.2 Network Security
Private Endpoints: Restrict model access to traffic from within a VCN. (Prerequisites for Private Endpoints)
Prerequisites:
- Create a VCN in the tenancy
- Create a private subnet in the VCN
- IAM policies: manage generative-ai-private-endpoint and manage virtual-network-family
The private endpoint is deployed as a VNIC in the private subnet. You manage the subnet's security rules and can optionally add network security groups (NSGs).
Service Gateway: For private network access to OCI services without internet traversal, use a VCN service gateway to reach OCI Generative AI through the Oracle Services Network.
7.3 Data Privacy and Model Isolation
- Dedicated AI clusters are single-tenant -- your data and models are not shared
- Custom model training data stays in your Object Storage bucket
- Fine-tuned models are private to your tenancy
- Guardrails provide additional content and PII controls
8. OCI GenAI Playground
The Playground is the Console-based no-code interface for testing models. It supports three modes:
| Mode | Function | Available Models |
|---|---|---|
| Chat | Conversational interaction with chat models | All chat models (Cohere, Llama, etc.) |
| Embedding | Generate text/image embeddings | All Cohere Embed models |
| Generation | Text generation (legacy) | Retired from on-demand; dedicated clusters only |
Playground Features
- Parameter tuning: Adjust temperature, top-p, top-k, penalties, max tokens in the UI
- Token display: Shows input and output token counts after each generation
- Code export: "View code" generates code in multiple languages (Python, Java, etc.) with authentication pre-configured
- Vision support: Upload .png or .jpg images (max 5 MB) for multimodal models
- Example prompts: Pre-built prompts for quick testing
- Embedding visualization: 2D projection of embedding vectors showing semantic similarity
Embedding Playground specifics:
- Maximum 96 inputs per run
- File upload: .txt files only, newline-separated entries
- Truncate parameter defaults to NONE (returns error if exceeded)
- Export embeddings as JSON
Exam trap: In the embedding Playground, the Truncate parameter resets to NONE every time you click Clear. You must re-set it before each run if you want truncation.
9. Regional Availability
Model availability varies significantly by region. Key patterns to know: (Models by Region)
| Region | On-Demand | Dedicated | Notes |
|---|---|---|---|
| US Midwest (Chicago) | Broadest on-demand availability | Full dedicated support | Primary region for all models |
| US East (Ashburn) | Limited | Dedicated for most models | No on-demand for many models |
| Germany Central (Frankfurt) | Google Gemini (via interconnect) | Cohere, Llama, OpenAI | EU data residency |
| UK South (London) | Limited | Cohere, Llama | |
| Japan Central (Osaka) | Limited | Cohere, Llama | APAC |
Availability symbols in Oracle docs:
- Check mark: Available (on-demand and dedicated)
- Check mark + o: On-demand only
- Check mark + d: Dedicated AI clusters only
- Check mark + G: Available through Oracle Interconnect for Google Cloud only
Exam trap: Google Gemini and xAI Grok models make external calls -- they route through Google Cloud or xAI infrastructure respectively. If data sovereignty is a concern, these models may not be appropriate. Cohere and Meta models run on OCI infrastructure.
10. Exam Focus: Common Traps and Pitfalls
On-demand response cap: Always 4,000 tokens. Dedicated mode is uncapped. If a question asks about maximum response length, the answer depends on the deployment mode.
Fine-tuning model support: Only cohere.command-r-08-2024, meta.llama-3.3-70b-instruct, and meta.llama-3.1-70b-instruct support fine-tuning. Command R+ does not. The 405B model does not. Embedding models do not.
T-Few vs. LoRA: The system auto-selects. T-Few is for Cohere; LoRA is for Llama (and also available for Cohere Command R 08-2024). You do not manually pick during model creation.
JSONL format: The only accepted format for training data is JSONL with exactly {"prompt": "...", "completion": "..."}. Minimum 32 pairs. One dataset per model. Auto-split 80/20.
Cluster unit types: Each model has a specific required unit type. You cannot mix models requiring different unit types on the same cluster. Know Small Cohere V2, Large Cohere V2_2, Large Generic 2, Embed Cohere.
IAM resource names: generative-ai-family is the aggregate type. For chat inference, the resource is generative-ai-chat with the use verb. For managing clusters, you need manage on generative-ai-dedicated-ai-cluster.
Private endpoints: Only for pretrained and custom models. Imported models are public only. Requires VCN + private subnet + IAM policies for both GenAI and virtual-network-family.
Embedding input_type: SEARCH_DOCUMENT for indexing documents, SEARCH_QUERY for queries. Using the correct type optimizes retrieval. Also available: CLASSIFICATION, CLUSTERING, IMAGE.
Embed dimensions: Standard = 1024, Light = 384. Embed v4 supports configurable dimensions (256, 512, 1024, 1536).
Guardrails disabled by default: You must explicitly enable content moderation, prompt injection defense, and PII detection. They do not run automatically.
References
- OCI Generative AI Service Overview
- Generative AI Concepts
- Pretrained Foundational Models
- Chat Models
- Embedding Models
- Cohere Command R+ (08-2024)
- Cohere Command R (08-2024)
- Meta Llama 3.1 (405B)
- Cohere Embed English 3
- Managing Dedicated AI Clusters
- Creating Hosting Clusters
- Creating Fine-Tuning Clusters
- Fine-Tuning Models
- Selecting a Fine-Tuning Method
- Fine-Tuning Hyperparameters
- Training Data Requirements
- Creating Endpoints
- IAM Policies for Generative AI
- Guardrails
- Models by Region
- Prerequisites for Private Endpoints
- EmbedTextDetails API Reference
- 1Z0-1127-25 Exam Syllabus (DBExam)
- 1Z0-1127-25 Exam Page (Oracle University)