WebP Cloud Services Blog

WebP Cloud now fetches content from origin servers through Cloudflare Workers, which hides the IP of the machines we use for origin fetching, and it now reports origin fetch time in its response headers.

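This post is also available in Simplified Chinese: 使用 Cloudflare Workers 回源以保护回源服务器 IP (Using Cloudflare Workers for origin fetching to protect the origin-fetching server’s IP)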

WebP Cloud Services offers two major services, Public Service and WebP Cloud. Both use Cloudflare’s CDN to achieve relatively low-latency responses globally and to protect our infrastructure (WAF, hiding the origin server’s IP, rate limiting, etc.). However, one piece was missing: our outgoing origin requests were not protected by Cloudflare.

As the image above shows, when a visitor requests an image that is not cached on WebP Cloud, WebP Cloud first requests the original image from the origin server, then converts it and serves the result. For WebP Cloud, this step means making an HTTP request directly to the origin server, which exposes the IP address of the machine WebP Cloud uses for origin fetching. If an attacker targets this IP and the route gets blackholed, WebP Cloud can no longer fetch content, which indirectly makes the service unavailable.

Security is our platform’s top priority, and protecting the infrastructure from attacks is a crucial part of security. This issue was considered from the outset, and we tried the following approaches:

  • Using Cloudflare WARP for origin fetching

    • We found WARP’s configuration somewhat complicated, and it couldn’t run directly in a container. When run on the host, the SOCKS port provided by WARP listens on the host’s 127.0.0.1, so an additional socat container is needed to forward it into our environment. We also saw a few disconnections while using it.
  • Using commercial VPNs for origin fetching

    • This was not very stable, as commercial VPNs do not seem to be designed for 24/7 connections. We saw multiple disconnections, each of which interrupted origin fetching.
  • Using Shadowsocks for origin fetching

    • This is what we used before switching to Workers. We ran Shadowsocks servers on multiple machines and set the http_proxy=socks5://shadowsocks:1080 environment variable in our containers. Since our program is written in Go, whose default HTTP client honors proxy environment variables, outgoing requests then went through the proxy. However, the IPs of the machines doing the origin fetching (now the Shadowsocks servers) were still exposed, so it wasn’t a perfect solution.

In the end, we considered using Workers for origin fetching, and the specific process is as follows:

  • Write a Worker and deploy it to Cloudflare.
  • The Worker accepts a POST request describing the origin fetch to perform (origin URL, method, and headers to send).
  • The Worker parses the corresponding request and sends an HTTP request to the origin server to fetch the image.
  • The Worker returns the image to WebP Cloud.
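
Seen from the caller’s side, the second and fourth steps might be sketched as follows. This is an illustration rather than WebP Cloud’s actual client (which is written in Go): `WORKER_URL`, `buildWorkerRequest`, and `fetchViaWorker` are hypothetical names, and the JSON field names follow the sample body in the Worker example later in this post.

```javascript
// Hypothetical Worker endpoint; replace with the URL of your own deployment.
const WORKER_URL = "https://origin-fetch.example.workers.dev/";

// Build the fetch() arguments that ask the Worker to retrieve one origin image.
function buildWorkerRequest(originUrl, accept, userAgent) {
	return {
		url: WORKER_URL,
		init: {
			method: "POST",
			headers: { "Content-Type": "application/json" },
			body: JSON.stringify({
				accept: accept,
				user_agent: userAgent,
				origin_url: originUrl,
				request_name: "GET"
			})
		}
	};
}

// Send the request and return the image bytes relayed by the Worker.
async function fetchViaWorker(originUrl, accept, userAgent) {
	const { url, init } = buildWorkerRequest(originUrl, accept, userAgent);
	const response = await fetch(url, init);
	if (!response.ok) {
		throw new Error(`Origin fetch via Worker failed with status ${response.status}`);
	}
	return new Uint8Array(await response.arrayBuffer());
}
```

Keeping the payload construction separate from the network call makes the request format easy to check without a deployed Worker.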

With this flow, the origin server only sees the IP of the Cloudflare Worker. For example, the request headers received by the origin might look like this:

Received a request from: 162.158.163.132
Host: test.nova.moe
Connection: Keep-Alive
accept-encoding: gzip
X-Forwarded-For: 2a06:98c0:3600::103
CF-RAY: 81d0f5b4853a5f3b-SIN
X-Forwarded-Proto: http
CF-Visitor: {"scheme":"http"}
CF-EW-Via: 15
CDN-Loop: cloudflare; subreqs=1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: WebP Cloud Services - Dev
cf-worker: webp.se
CF-Connecting-IP: 2a06:98c0:3600::103
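
An origin that wants to note which requests arrived through a Worker could inspect these headers. The following is a minimal sketch, assuming the headers are available as a plain object with lowercase keys. `describeWorkerRequest` is a hypothetical helper, and reading the `CF-RAY` suffix as a data-center code is an interpretation of the sample above. Anyone who can reach the origin directly can forge headers, so this is diagnostic rather than a security check.

```javascript
// Summarize the Cloudflare-related headers of an incoming request.
// `headers` is assumed to be a plain object whose keys are lowercase header names.
function describeWorkerRequest(headers) {
	const workerZone = headers["cf-worker"] || null;
	const ray = headers["cf-ray"] || "";
	// In the sample above the ray ID ends with "-SIN"; take the part after the last hyphen.
	const dash = ray.lastIndexOf("-");
	const colo = dash >= 0 ? ray.slice(dash + 1) : null;
	return {
		viaWorker: workerZone !== null,
		workerZone: workerZone,
		colo: colo,
		connectingIp: headers["cf-connecting-ip"] || null
	};
}

// Header values copied from the sample request above.
const sampleHeaders = {
	"cf-ray": "81d0f5b4853a5f3b-SIN",
	"cf-worker": "webp.se",
	"cf-connecting-ip": "2a06:98c0:3600::103"
};
console.log(describeWorkerRequest(sampleHeaders));
```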

The Worker itself is straightforward: after receiving the POST request, it parses the body and passes the corresponding headers on to the origin. A simplified example might look like this:

// Forward the request described by post_body to the origin and relay the result.
async function handleProxy(post_body) {
	// Only the headers supplied by the caller are sent to the origin.
	const headers = {
		"Accept": post_body.accept,
		"User-Agent": post_body.user_agent
	};

	const response = await fetch(post_body.origin_url, {
		method: post_body.request_name,
		headers: headers
	});

	if (response.ok) {
		// Success: relay the origin's body, status, and headers unchanged.
		const res = new Response(response.body, {
			status: response.status,
			statusText: response.statusText,
			headers: response.headers
		});
		return res;
	} else {
		// Failure: return the origin's status with a short text body.
		return new Response(response.statusText || "Unknown Error", {
			status: response.status,
			statusText: response.statusText
		});
	}
}

export default {
	async fetch(request, env, ctx) {
		// Expected POST body:
		// {
		// "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
		// "user_agent": "WebP Cloud Services/1.0",
		// "origin_url": "https://docs.webp.sh/images/webp_server.jpg",
		// "request_name": "GET"
		// }
		let post_body;
		try {
			post_body = await request.json();
		} catch (error) {
			return new Response("Invalid JSON data", { status: 400 });
		}
		// Called outside the try block so an origin fetch error is not reported as invalid JSON.
		return handleProxy(post_body);
	},
};
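
The simplified example forwards whatever `origin_url` and `request_name` it receives. A deployed Worker would probably want to check the body first. The sketch below shows one way that check might look. It is not part of the original Worker, and `validatePostBody`, `ALLOWED_METHODS`, and the choice of permitted methods are assumptions made for illustration.

```javascript
// Methods a read-only origin fetch plausibly needs (an assumption, not from the original Worker).
const ALLOWED_METHODS = ["GET", "HEAD"];

// Return null when the body looks usable, or a short error message otherwise.
function validatePostBody(post_body) {
	if (post_body === null || typeof post_body !== "object") {
		return "Body must be a JSON object";
	}
	if (!ALLOWED_METHODS.includes(post_body.request_name)) {
		return "Unsupported request_name";
	}
	let url;
	try {
		// The URL constructor throws on anything that is not an absolute URL.
		url = new URL(post_body.origin_url);
	} catch (error) {
		return "origin_url is not a valid URL";
	}
	if (url.protocol !== "http:" && url.protocol !== "https:") {
		return "origin_url must use http or https";
	}
	return null;
}
```

In the fetch handler, this could run right after the JSON is parsed, returning a 400 response with the message whenever the result is not null.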

With origin fetching through Workers in place, our infrastructure is now entirely concealed within Cloudflare’s network, which reduces our attack surface and makes our origin requests more stable.

In addition, some of our customers have recently asked about the origin fetch time for each request, hoping to use it to understand the entire request lifecycle (from the moment the browser sends the request, through Cloudflare to WebP Cloud, to WebP Cloud completing the origin fetch, transforming the image, and returning it). We have now added this debugging information as two additional response headers:

  • x-webpcloud-cost
  • x-webpcloud-fetch-cost

Here’s an example of a request:

curl -I -H "Accept: image/webp" https://1303b8f.webp.li/pics/fk7-model3/a052-3.JPG
HTTP/2 200 
...
content-type: image/webp
content-length: 514932
access-control-allow-origin: *
x-powered-by: WebP Cloud Services (HIO)
x-webpcloud-cache: Miss
x-webpcloud-cost: 906
x-webpcloud-fetch-cost: 593
etag: W/"516265-4FEF7623"
...

x-webpcloud-cost represents the time WebP Cloud takes from receiving the request until completing the response, while x-webpcloud-fetch-cost indicates the time our service spends on origin fetching, both measured in milliseconds (ms).
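
To see how a request’s time splits up, the two headers can be subtracted. The sketch below assumes the origin fetch time is included in the total, so the remainder covers conversion and other processing. `summarizeTiming` is a hypothetical helper, and the headers are assumed to be a plain object with lowercase keys.

```javascript
// Read the two timing headers (milliseconds) and derive the time not spent on origin fetching.
function summarizeTiming(headers) {
	const total = Number.parseInt(headers["x-webpcloud-cost"], 10);
	const fetchCost = Number.parseInt(headers["x-webpcloud-fetch-cost"], 10);
	if (Number.isNaN(total) || Number.isNaN(fetchCost)) {
		return null; // One or both headers are missing or not numeric.
	}
	return {
		totalMs: total,
		fetchMs: fetchCost,
		otherMs: total - fetchCost
	};
}

// Values taken from the example response above.
const timing = summarizeTiming({ "x-webpcloud-cost": "906", "x-webpcloud-fetch-cost": "593" });
console.log(timing.otherMs); // 313
```

For the example response, 593 ms of the 906 ms total went to origin fetching, leaving 313 ms for everything else.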


The WebP Cloud Services team is a small team of three individuals from Shanghai and Helsingborg. Since we are not funded and have no profit pressure, we remain committed to doing what we believe is right. We strive to do our best within the scope of our resources and capabilities. We also engage in various activities without affecting the services we provide to the public, and we continuously explore novel ideas in our products.

If you find this service interesting, feel free to log in to the WebP Cloud Dashboard to experience it. If you’re curious about other magical features it offers, take a look at our WebP Cloud Services Docs. We hope everyone enjoys using it!

