Welcome to CLIP-as-service!#
CLIP-as-service is a low-latency, high-scalability service for embedding images and text. It can be easily integrated as a microservice into neural search solutions.
⚡ Fast: Serve CLIP models with TensorRT, ONNX runtime and PyTorch w/o JIT at 800 QPS[*]. Non-blocking duplex streaming on requests and responses, designed for large data and long-running tasks.
🫐 Elastic: Horizontally scale up and down multiple CLIP models on a single GPU, with automatic load balancing.
🐥 Easy-to-use: No learning curve, minimalist design on client and server. Intuitive and consistent API for image and sentence embedding.
👒 Modern: Async client support. Easily switch between gRPC, HTTP and WebSocket protocols with TLS and compression (see the client sketch below).
🍱 Integration: Smooth integration with the neural search ecosystem, including Jina and DocArray. Build cross-modal and multi-modal solutions in no time.
[*] with default config (single replica, PyTorch no JIT) on GeForce RTX 3090.
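The protocol is selected purely by the scheme of the server address. A minimal client sketch, assuming a server is listening on each of the (placeholder) ports below and is configured for the corresponding protocol:
from clip_client import Client
# the URI scheme picks the protocol; the client API is otherwise identical
c_grpc = Client('grpc://0.0.0.0:51000')   # gRPC
c_http = Client('http://0.0.0.0:51001')   # HTTP
c_ws = Client('ws://0.0.0.0:51002')       # WebSocket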
Try it!#
Install#
Make sure you are using Python 3.7+. You can install the client and server independently; it is not required to install both. For example, you can install clip_server on a GPU machine and clip_client on a local laptop.
pip install clip-client
pip install clip-server
To serve with the ONNX runtime instead of the default PyTorch runtime:
pip install "clip_server[onnx]"
To serve with TensorRT:
pip install nvidia-pyindex
pip install "clip_server[tensorrt]"
Quick check#
After installing, you can run the following commands for a quick connectivity check.
Start the server#
To start the server with the default PyTorch backend:
python -m clip_server
With the ONNX runtime backend:
python -m clip_server onnx-flow.yml
With the TensorRT backend:
python -m clip_server tensorrt-flow.yml
The first time you start the server, it will download the default pretrained model, which may take a while depending on your network speed. You will then see address information similar to the following:
╭────────────── 🔗 Endpoint ───────────────╮
│  🔗     Protocol                   GRPC  │
│  🏠        Local          0.0.0.0:51000  │
│  🔒      Private    192.168.31.62:51000  │
│  🌍       Public   87.105.159.191:51000  │
╰──────────────────────────────────────────╯
This means the server is ready to serve. Note down the three addresses shown above; you will need them later.
Connect from client#
Tip
Depending on the location of the client and server, you may use different IP addresses:
Client and server are on the same machine: use local address, e.g.
0.0.0.0
Client and server are connected to the same router: use private network address, e.g.
192.168.31.62
Server is in public network: use public network address, e.g.
87.105.159.191
Run the following Python script:
from clip_client import Client
c = Client('grpc://0.0.0.0:51000')
c.profile()
will give you:
 Roundtrip                   16ms  100%
 ├──  Client-server network   8ms   49%
 └──  Server                  8ms   51%
      ├──  Gateway-CLIP network  2ms  25%
      └──  CLIP model            6ms  75%
{'Roundtrip': 15.684750003856607, 'Client-server network': 7.684750003856607, 'Server': 8, 'Gateway-CLIP network': 2, 'CLIP model': 6}
It means the client and the server are now connected. Well done!
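With the connection verified, you can start embedding. A minimal sketch using the same local address; with the default ViT-B/32 model each embedding is 512-dimensional:
from clip_client import Client
c = Client('grpc://0.0.0.0:51000')
# embed three sentences; the result is a numpy array of shape (3, 512) with the default model
r = c.encode(['First do it', 'then do it right', 'then do it better'])
print(r.shape)
encode also accepts images given as local file paths, HTTP URLs or base64-encoded data URIs, returning one embedding per input.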
Support#
Join our Discord community and chat with other community members about ideas.
Watch our Engineering All Hands to learn about Jina's new features and stay up-to-date with the latest AI techniques.
Subscribe to our YouTube channel for the latest video tutorials.
Join Us#
CLIP-as-service is backed by Jina AI and licensed under Apache-2.0. We are actively hiring AI engineers and solution engineers to build the next neural search ecosystem in open source.