
Copyright 2024 Ocean Protocol Foundation Ltd.


Deploy C2D

Last updated 11 months ago


This chapter describes how to deploy the C2D component of the Ocean stack. As mentioned in the C2D Architecture chapter, the Compute-to-Data component uses Kubernetes to orchestrate the creation and deletion of the pods in which C2D jobs run.

For those who do not have a Kubernetes environment available, this guide includes instructions for installing Minikube, a lightweight Kubernetes implementation that creates a VM on your local machine and deploys a simple single-node cluster. If you already have a Kubernetes environment in place, skip directly to step 4 of this guide.

Requirements

  • Communications: a functioning internet-accessible provider service

  • Hardware: a server capable of running compute jobs (for example, we used a machine with 8 CPUs, 16 GB RAM, a 100 GB SSD, and a fast internet connection). See the Set Up a Server guide for how to create a server.

  • Operating system: Ubuntu 22.04 LTS

Steps

Install Docker and Git

sudo apt update
sudo apt install git docker.io
sudo usermod -aG docker $USER && newgrp docker

Install Minikube

wget -q --show-progress https://github.com/kubernetes/minikube/releases/download/v1.22.0/minikube_1.22.0-0_amd64.deb
sudo dpkg -i minikube_1.22.0-0_amd64.deb

Start Minikube

minikube config set kubernetes-version v1.16.0
minikube start --cni=calico --driver=docker --container-runtime=docker

Depending on the number of available CPUs, the amount of RAM, and the resources your jobs require, consider adding the options --cpus, --memory, and --disk-size to avoid runtime issues.
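
As a rough illustration of these resource options, the sketch below wraps the start command in a shell function. The values of 4 CPUs, 8g of memory, and 50g of disk are assumptions chosen for the example, not recommendations from this guide. The function name start_minikube and the MINIKUBE override variable are hypothetical helpers.

```shell
# Hypothetical helper: start Minikube with explicit resource limits.
# MINIKUBE can be overridden (for example with "echo") to preview the command.
MINIKUBE="${MINIKUBE:-minikube}"

start_minikube() {
  # Arguments: CPU count, memory, disk size.
  # The defaults are illustrative only; size them for your host and jobs.
  cpus="${1:-4}"
  memory="${2:-8g}"
  disk="${3:-50g}"
  "$MINIKUBE" start --cni=calico --driver=docker --container-runtime=docker \
    --cpus="$cpus" --memory="$memory" --disk-size="$disk"
}

# Usage: start_minikube 4 8g 50g
```

Setting MINIKUBE=echo before calling the function prints the arguments instead of starting a cluster, which is a cheap way to check them first.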

Install the Kubernetes command line tool (kubectl)

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl.sha256"
echo "$(cat kubectl.sha256)  kubectl" | sha256sum --check

sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

Wait until all the default pods are running (READY 1/1).

watch kubectl get pods --all-namespaces
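
For scripted setups, where an interactive watch is inconvenient, kubectl wait can block until the pods report Ready. This is a minimal sketch, assuming a default timeout of 300 seconds; the function name wait_for_pods and the KUBECTL override variable are illustrative. Pods that have already run to completion never become Ready, so the call may time out on clusters that run one-off jobs.

```shell
# Hypothetical helper: block until every pod in every namespace is Ready.
# KUBECTL can be overridden (for example with "echo") to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

wait_for_pods() {
  # Optional argument: timeout, e.g. 120s (the 300s default is an assumption).
  timeout="${1:-300s}"
  "$KUBECTL" wait --for=condition=Ready pods --all --all-namespaces \
    --timeout="$timeout"
}

# Usage: wait_for_pods 300s
```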

Download all required files

Create a folder, cd into it, and clone the following repositories:

mkdir computeToData
cd computeToData
git clone https://github.com/oceanprotocol/operator-service.git
git clone https://github.com/oceanprotocol/operator-engine.git

Create namespaces

In this tutorial, we are going to create only one environment, called ocean-compute.

kubectl create ns ocean-operator
kubectl create ns ocean-compute

Set up PostgreSQL

For now, communication between different components is made through pgsql. This will change in the near future.

Edit operator-service/kubernetes/postgres-configmap.yaml. Change POSTGRES_PASSWORD to a long random password.
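
One possible way to produce such a password is to read random bytes from the operating system. The sketch below is an assumption about tooling rather than part of the Ocean stack: it uses only standard utilities, and the function name generate_secret is hypothetical.

```shell
# Hypothetical helper: print 32 random bytes as 64 hexadecimal characters.
generate_secret() {
  # od prints the bytes as hex; tr removes the spaces and newlines od adds.
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

# Usage: generate_secret   (paste the output into POSTGRES_PASSWORD)
```

Hex output avoids characters that would need quoting or escaping inside a YAML file.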

Then deploy pgsql:

kubectl config set-context --current --namespace ocean-operator
kubectl create -f operator-service/kubernetes/postgres-configmap.yaml
kubectl create -f operator-service/kubernetes/postgres-storage.yaml
kubectl create -f operator-service/kubernetes/postgres-deployment.yaml
kubectl create -f operator-service/kubernetes/postgresql-service.yaml

Congrats, pgsql is running now.

Run the IPFS host (optional)

To store the results and the logs of the C2D jobs, you can use either an AWS S3 bucket or IPFS.

In case you want to use IPFS you need to run an IPFS host, as presented below.

export ipfs_staging=~/ipfs_staging
export ipfs_data=~/ipfs_data

docker run -d --name ipfs_host -v $ipfs_staging:/export -v $ipfs_data:/data/ipfs -p 4001:4001 -p 4001:4001/udp -p 127.0.0.1:8080:8080 -p 127.0.0.1:5001:5001 ipfs/go-ipfs:latest

sudo /bin/sh -c 'echo "127.0.0.1    youripfsserver" >> /etc/hosts'

Update the storage class

The storage class is used by Kubernetes to create the temporary volumes on which the data used by the algorithm will be stored.

Please ensure that your class allocates volumes in the same region and zone where you are running your pods.

You need to consider the storage class available for your environment.

For Minikube, you can use the default 'standard' class.

In AWS, we created our own 'standard' class:

kubectl get storageclass standard -o yaml
allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - us-east-1a
apiVersion: storage.k8s.io/v1
kind: StorageClass
parameters:
    fsType: ext4
    type: gp2
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Delete
volumeBindingMode: Immediate

For more information, please visit https://kubernetes.io/docs/concepts/storage/storage-classes/

If you need to use your own classes, you will need to edit 'operator-engine/kubernetes/operator.yml'.

Setup C2D Orchestrator

C2D Orchestrator (aka operator-service) has two main functions:

  • First, it's the outside interface of your C2D cluster to the world. External components (like Provider) call the APIs exposed by this service.

  • Secondly, operator-service manages multiple environments and sends the jobs to the right environment.

Edit operator-service/kubernetes/deployment.yaml. Change ALLOWED_ADMINS to a long random password.

Let's deploy C2D Orchestrator.

kubectl config set-context --current --namespace ocean-operator
kubectl apply -f operator-service/kubernetes/deployment.yaml

Now, let's expose the service.

kubectl expose deployment operator-api --namespace=ocean-operator --port=8050

You can run a port forward in a new terminal (see below) or create your ingress service and set up DNS and certificates (not covered here):

kubectl -n ocean-operator port-forward svc/operator-api 8050

Alternatively, you could use another method to communicate between the C2D environment and the provider, such as an SSH tunnel.

And now it's time to initialize the database.

If your Minikube is running on compute.example.com:

curl -X POST "https://compute.example.com/api/v1/operator/pgsqlinit" -H "accept: application/json" -H "Admin: myAdminPass"
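
If you rely on the port forward rather than a public hostname, the same call can be aimed at the local port. This is a sketch under assumptions: it presumes the forwarded port serves plain HTTP on localhost:8050, and that the admin secret is the value configured for the operator service. The function name init_operator_db and the CURL override variable are hypothetical.

```shell
# Hypothetical helper: call the database initialization endpoint.
# CURL can be overridden to preview the request instead of sending it.
CURL="${CURL:-curl}"

init_operator_db() {
  # $1: base URL of the operator service (assumed default: local port forward)
  # $2: admin secret, sent in the Admin header
  base_url="${1:-http://localhost:8050}"
  admin_secret="$2"
  # ${base_url%/} drops one trailing slash so the path is not doubled.
  "$CURL" -X POST "${base_url%/}/api/v1/operator/pgsqlinit" \
    -H "accept: application/json" -H "Admin: ${admin_secret}"
}

# Usage: init_operator_db http://localhost:8050 myAdminPass
```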

Congrats, you have operator-service running.

Setup your first environment

Let's create our first environment. Edit operator-engine/kubernetes/operator.yml.

  • set OPERATOR_PRIVATE_KEY. This has to be unique among multiple environments. In the future, this will be the account credited with fees.

Finally, let's deploy it:

kubectl config set-context --current --namespace ocean-compute
kubectl create -f operator-service/kubernetes/postgres-configmap.yaml
kubectl apply  -f operator-engine/kubernetes/sa.yml
kubectl apply  -f operator-engine/kubernetes/binding.yml
kubectl apply  -f operator-engine/kubernetes/operator.yml

Optional: For production environments, it's safer to block access to metadata. To do so, run the command below:

kubectl -n ocean-compute apply -f operator-engine/kubernetes/egress.yaml

Congrats, your C2D environment is running.

If you want to deploy another one, just repeat the steps above, with a different namespace and different OPERATOR_PRIVATE_KEY.

Update Provider

Update your existing provider service by updating the operator_service.url value in config.ini, or set the appropriate ENV variable.

operator_service.url = https://compute.example.com/

Restart your provider service.

Automated deployment example

If your setup is more complex, you can check out our automated deployment example: https://github.com/oceanprotocol/c2d_barge/blob/main/c2d_barge_deployer/docker-entrypoint.sh. This script is used by barge to automatically deploy the C2D cluster, with two environments.

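Notes on the steps above:

  • Start Minikube: the first command is important and solves a PersistentVolumeClaims problem. For other options to run Minikube, refer to the Minikube documentation.

  • Database initialization: myAdminPass is the admin secret configured in the Setup C2D Orchestrator step.

  • Setup your first environment: optionally change more env variables to customize your environment. Check the README of the operator engine to customize your deployment. At a minimum, you should add your IPFS URLs or AWS settings, and add (or remove) notification URLs.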
