What is Kestra?

Kestra is an open-source platform for orchestrating data workflows and processing pipelines. It helps data teams automate end-to-end data processes, including scheduling, monitoring, and troubleshooting.

Key Features of Kestra:

  • Workflow Management: Kestra allows you to define, schedule, and manage complex workflows, including executing tasks sequentially or in parallel.
  • Integration-Friendly: It supports integration with various data systems and cloud services.
  • Monitoring and Reporting: Kestra provides detailed monitoring tools, helping you track the status and performance of your workflows.
  • Scalability: With its distributed architecture, Kestra can scale to handle large workloads efficiently.
  • User-Friendly Interface: Kestra offers an intuitive web interface where you can design and manage workflows directly.

At the time of writing, Kestra does not have a cloud version available. You clone it to your laptop and run it locally. You can also run it on an instance and set up a domain so that everyone can access it.

You can install Kestra using different methods. Select one that matches your preferred environment.

You can deploy Kestra (almost) anywhere, from your laptop or an on-prem server to a distributed cluster running in a public cloud. Note that some Kestra plugins, such as the Script plugin, require Docker-in-Docker (DinD), which is not supported in certain environments such as AWS Fargate. For production deployments, we recommend using Kubernetes or a virtual machine.
The easiest way to install Kestra locally is to use Docker (recommended).

Install Kestra

docker run --pull=always --rm -it -p 8080:8080 kestra/kestra:latest server local

Start writing flows as code in Kestra

Commonly used properties:

  • id: the name of your workflow
  • namespace: a logical grouping for your workflow (for example, by team or environment)
  • tasks: the list of tasks to execute in your flow

Here is a template for this workflow:

Tasks:

id: get_started
namespace: example
tasks:
  - id: helloworld
    type: io.kestra.core.tasks.log.Log
    message: Hello world!

Take a look at my workflow; it is quite easy to understand.
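To make the three properties concrete, here is a minimal Python sketch that mirrors the template as a dictionary and checks that id, namespace, and tasks are present. The helper missing_flow_keys is hypothetical and not part of Kestra; Kestra performs its own, much richer validation when you save a flow.

```python
# Hypothetical helper: not part of Kestra, it only illustrates the three properties.
REQUIRED_KEYS = ("id", "namespace", "tasks")


def missing_flow_keys(flow: dict) -> list[str]:
    """Return the required top-level keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not flow.get(key)]


# The same flow as the template, written as a Python dict.
flow = {
    "id": "get_started",
    "namespace": "example",
    "tasks": [
        {
            "id": "helloworld",
            "type": "io.kestra.core.tasks.log.Log",
            "message": "Hello world!",
        }
    ],
}

print(missing_flow_keys(flow))                   # prints []
print(missing_flow_keys({"id": "get_started"}))  # prints ['namespace', 'tasks']
```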

Inputs:

inputs:
  - id: variable_name
    type: STRING
    defaults: example_string

Reference the input inside a task with:

{{ inputs.variable_name }}

Outputs:

{{ outputs.task_id.vars.output_name }}
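The input and output expressions are dotted paths that get resolved against the execution's data. The sketch below is a simplified emulation of that lookup in plain Python. It is not Kestra's template engine, and the render function and sample context are assumptions made up for illustration.

```python
import re

# Matches {{ dotted.path }} with optional spaces inside the braces.
_EXPR = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def render(template: str, context: dict) -> str:
    """Replace each {{ a.b.c }} with the value found by walking the context."""
    def lookup(match: re.Match) -> str:
        value = context
        for part in match.group(1).split("."):
            value = value[part]  # raises KeyError if the path does not exist
        return str(value)

    return _EXPR.sub(lookup, template)


# Illustrative context: "task_id" and "output_name" are the placeholders from the text.
context = {
    "inputs": {"variable_name": "example_string"},
    "outputs": {"task_id": {"vars": {"output_name": 42}}},
}

print(render("{{ inputs.variable_name }}", context))                    # example_string
print(render("value={{ outputs.task_id.vars.output_name }}", context))  # value=42
```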

Triggers:

triggers:
  - id: hour_trigger
    type: io.kestra.core.models.triggers.types.Schedule
    cron: "0 * * * *"
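The cron expression 0 * * * * means "at minute 0 of every hour". As a rough illustration, the following Python sketch lists the next few times such a schedule would fire. The function next_hourly_runs is a hypothetical helper that handles only this one expression and ignores time zones and daylight saving; it is not a general cron parser.

```python
from datetime import datetime, timedelta


def next_hourly_runs(start: datetime, count: int) -> list[datetime]:
    """Return the next `count` times matching cron `0 * * * *`, strictly after `start`."""
    # Round down to the top of the current hour, then step forward one hour at a time.
    first = start.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return [first + timedelta(hours=i) for i in range(count)]


start = datetime(2024, 1, 1, 9, 30)
for run in next_hourly_runs(start, 3):
    print(run.isoformat())
# 2024-01-01T10:00:00
# 2024-01-01T11:00:00
# 2024-01-01T12:00:00
```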

Test it by writing a Python script, api_example.py, and running it with Kestra:

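import requests as rq

# Fetch the repository metadata from the public GitHub API.
r = rq.get("https://api.github.com/repos/kestra-io/kestra", timeout=10)
r.raise_for_status()  # stop early on an HTTP error
# The star count is in the "stargazers_count" field of the JSON body.
gh_star = r.json()['stargazers_count']
print(gh_star)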
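If you want to check the parsing logic without calling the GitHub API, one option is to separate it from the HTTP request. The sketch below assumes a hypothetical helper, extract_star_count, and uses a made-up sample payload; the real response contains many more fields and a different star count.

```python
import json


def extract_star_count(payload: str) -> int:
    """Parse a GitHub repository JSON document and return its star count."""
    data = json.loads(payload)
    return int(data["stargazers_count"])


# Made-up sample payload for offline testing, not real API data.
sample = '{"full_name": "kestra-io/kestra", "stargazers_count": 123}'
print(extract_star_count(sample))  # prints 123
```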

Test:

  1. Open the Kestra web interface.
  • Click Flows, or create a new flow right away.
  2. In the flow editor, paste the code below to try out declaring inputs.
  • Click Save, then Execute.
id: huytest
namespace: company.myteam
description: Test workflow for Kestra that logs a message to the console

inputs:
  - id: user
    type: STRING
    defaults: Vu Huy
  - id: app_url1
    type: STRING
    defaults: https://huyvu15.github.io
  - id: api_url
    type: STRING
    defaults: https://dummyjson.com/products

tasks:
  # - id: test
  #   type: io.kestra.plugin.core.log.Log
  #   message: hello world! {{ inputs.user }}

  # - id: test_app
  #   type: io.kestra.plugin.fs.http.Request
  #   uri: "{{ inputs.app_url1 }}"

  - id: api
    type: io.kestra.plugin.fs.http.Request
    uri: "{{ inputs.api_url }}"

  - id: log_response
    type: io.kestra.core.tasks.log.Log
    message: "{{ outputs.api.body }}"

  3. Review the input values and click Execute.

  4. Check the result.

  5. Open the Outputs tab to see the API response.

  6. Try looking up {{ outputs.api.body }}.
  • Remember to type it with exactly two spaces, as shown.
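The test flow passes data from the api task to the log_response task through outputs.api.body. The following toy model in Python shows that hand-off: each step stores its result under its task id, and the next step reads it from there. The run_flow and fake_fetch functions are hypothetical, and the HTTP call is replaced by a stand-in so the sketch runs offline. It is not how Kestra's engine is implemented.

```python
from typing import Callable


def run_flow(inputs: dict, fetch: Callable[[str], str]) -> dict:
    """Run two steps in order, storing each step's output under its task id."""
    outputs: dict = {}
    # Task "api": call the URL from the inputs and keep the response body.
    outputs["api"] = {"body": fetch(inputs["api_url"])}
    # Task "log_response": read the previous task's output and log it.
    print(outputs["api"]["body"])
    return outputs


def fake_fetch(url: str) -> str:
    # Stand-in for a real HTTP request so the sketch runs without a network.
    return f"response from {url}"


result = run_flow({"api_url": "https://dummyjson.com/products"}, fake_fetch)
```

Swapping fake_fetch for a function that performs a real HTTP request would make the model behave more like the flow, at the cost of needing network access.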

Reference