pyzmq/perf/perf.ipynb

551 lines
35 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# pyzmq performance crossover\n",
"\n",
"Sample plots from running `python collect.py` script to generate performance data.\n",
"\n",
"PyZMQ's zero-copy implementation has a nontrivial overhead, due to the requirement to notify Python garbage collection when libzmq is done with a message from its IO thread. Performance optimizations over time and application/machine circumstances change where the crossover is, where zero-copy is more cost than benefit.\n",
"\n",
"pyzmq 17 introduces `zmq.COPY_THRESHOLD`, a performance-tuning threshold,\n",
"where messages will not be copied even if sent with `copy=False`.\n",
"Based on these experiments,\n",
"the default value for `zmq.COPY_THRESHOLD` in pyzmq 17.0 is 64kB,\n",
"which seems to be a common crossover point.\n",
"\n",
"In general, it is recommended to only use zero-copy for 'large' messages (at least 10s-100s of kB) because the bookkeeping overhead is significantly greater than small `memcpy` calls."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pickle\n",
"\n",
"import altair as alt\n",
"import pandas as pd\n",
"\n",
"\n",
"def crossover(data, column, ylabel=\"msgs/sec\"):\n",
" \"\"\"Plot the crossover for copy=True|False\"\"\"\n",
" return (\n",
" alt.Chart(data)\n",
" .mark_point()\n",
" .encode(\n",
" color=\"copy\",\n",
" x=alt.X(\"size\", title=\"size (B)\").scale(type=\"log\"),\n",
" y=alt.Y(column, title=ylabel).scale(type=\"log\"),\n",
" )\n",
" )\n",
"\n",
"\n",
"def relative(data, column, yscale=\"linear\"):\n",
" \"\"\"Plot a normalized value showing relative performance\"\"\"\n",
" copy_mean = data[data[\"copy\"]].groupby(\"size\")[column].mean()\n",
" no_copy = data[~data[\"copy\"]]\n",
" reference = copy_mean[no_copy[\"size\"]]\n",
" return (\n",
" alt.Chart(\n",
" pd.DataFrame(\n",
" {\n",
" \"size\": no_copy[\"size\"],\n",
" \"no-copy speedup\": no_copy[column] / reference.array,\n",
" }\n",
" )\n",
" )\n",
" .mark_point()\n",
" .encode(\n",
" x=alt.X(\"size\", title=\"size (B)\").scale(type=\"log\"),\n",
" y=alt.Y(\"no-copy speedup\", title=\"\").scale(type=yscale),\n",
" )\n",
" )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Throughput\n",
"\n",
"Throughput tests measure sending messages on a PUSH-PULL pair as fast as possible. These numbers count the time from first `recv` to the last."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"with open(\"thr.pickle\", \"rb\") as f:\n",
" thr = pickle.load(f)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>size</th>\n",
" <th>count</th>\n",
" <th>copy</th>\n",
" <th>poll</th>\n",
" <th>transport</th>\n",
" <th>sends</th>\n",
" <th>throughput</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>100</td>\n",
" <td>655360</td>\n",
" <td>True</td>\n",
" <td>False</td>\n",
" <td>ipc</td>\n",
" <td>1.618701e+06</td>\n",
" <td>412531.609473</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>100</td>\n",
" <td>1310720</td>\n",
" <td>True</td>\n",
" <td>False</td>\n",
" <td>ipc</td>\n",
" <td>1.636403e+06</td>\n",
" <td>415309.020795</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>100</td>\n",
" <td>262144</td>\n",
" <td>False</td>\n",
" <td>False</td>\n",
" <td>ipc</td>\n",
" <td>1.781225e+05</td>\n",
" <td>178051.290985</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>100</td>\n",
" <td>524288</td>\n",
" <td>False</td>\n",
" <td>False</td>\n",
" <td>ipc</td>\n",
" <td>1.817957e+05</td>\n",
" <td>181758.942987</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>215</td>\n",
" <td>524288</td>\n",
" <td>True</td>\n",
" <td>False</td>\n",
" <td>ipc</td>\n",
" <td>1.599087e+06</td>\n",
" <td>408223.469186</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" size count copy poll transport sends throughput\n",
"0 100 655360 True False ipc 1.618701e+06 412531.609473\n",
"1 100 1310720 True False ipc 1.636403e+06 415309.020795\n",
"2 100 262144 False False ipc 1.781225e+05 178051.290985\n",
"3 100 524288 False False ipc 1.817957e+05 181758.942987\n",
"4 215 524288 True False ipc 1.599087e+06 408223.469186"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"thr.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Plot the throughput performance vs msg size for copy/no-copy.\n",
"This should show us a crossover point where zero-copy starts to outperform copying."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"<style>\n",
" #altair-viz-62389b51eb9046bf908b3b3e6cd0d17d.vega-embed {\n",
" width: 100%;\n",
" display: flex;\n",
" }\n",
"\n",
" #altair-viz-62389b51eb9046bf908b3b3e6cd0d17d.vega-embed details,\n",
" #altair-viz-62389b51eb9046bf908b3b3e6cd0d17d.vega-embed details summary {\n",
" position: relative;\n",
" }\n",
"</style>\n",
"<div id=\"altair-viz-62389b51eb9046bf908b3b3e6cd0d17d\"></div>\n",
"<script type=\"text/javascript\">\n",
" var VEGA_DEBUG = (typeof VEGA_DEBUG == \"undefined\") ? {} : VEGA_DEBUG;\n",
" (function(spec, embedOpt){\n",
" let outputDiv = document.currentScript.previousElementSibling;\n",
" if (outputDiv.id !== \"altair-viz-62389b51eb9046bf908b3b3e6cd0d17d\") {\n",
" outputDiv = document.getElementById(\"altair-viz-62389b51eb9046bf908b3b3e6cd0d17d\");\n",
" }\n",
" const paths = {\n",
" \"vega\": \"https://cdn.jsdelivr.net/npm/vega@5?noext\",\n",
" \"vega-lib\": \"https://cdn.jsdelivr.net/npm/vega-lib?noext\",\n",
" \"vega-lite\": \"https://cdn.jsdelivr.net/npm/vega-lite@5.16.3?noext\",\n",
" \"vega-embed\": \"https://cdn.jsdelivr.net/npm/vega-embed@6?noext\",\n",
" };\n",
"\n",
" function maybeLoadScript(lib, version) {\n",
" var key = `${lib.replace(\"-\", \"\")}_version`;\n",
" return (VEGA_DEBUG[key] == version) ?\n",
" Promise.resolve(paths[lib]) :\n",
" new Promise(function(resolve, reject) {\n",
" var s = document.createElement('script');\n",
" document.getElementsByTagName(\"head\")[0].appendChild(s);\n",
" s.async = true;\n",
" s.onload = () => {\n",
" VEGA_DEBUG[key] = version;\n",
" return resolve(paths[lib]);\n",
" };\n",
" s.onerror = () => reject(`Error loading script: ${paths[lib]}`);\n",
" s.src = paths[lib];\n",
" });\n",
" }\n",
"\n",
" function showError(err) {\n",
" outputDiv.innerHTML = `<div class=\"error\" style=\"color:red;\">${err}</div>`;\n",
" throw err;\n",
" }\n",
"\n",
" function displayChart(vegaEmbed) {\n",
" vegaEmbed(outputDiv, spec, embedOpt)\n",
" .catch(err => showError(`Javascript Error: ${err.message}<br>This usually means there's a typo in your chart specification. See the javascript console for the full traceback.`));\n",
" }\n",
"\n",
" if(typeof define === \"function\" && define.amd) {\n",
" requirejs.config({paths});\n",
" require([\"vega-embed\"], displayChart, err => showError(`Error loading script: ${err.message}`));\n",
" } else {\n",
" maybeLoadScript(\"vega\", \"5\")\n",
" .then(() => maybeLoadScript(\"vega-lite\", \"5.16.3\"))\n",
" .then(() => maybeLoadScript(\"vega-embed\", \"6\"))\n",
" .catch(showError)\n",
" .then(() => displayChart(vegaEmbed));\n",
" }\n",
" })({\"config\": {\"view\": {\"continuousWidth\": 300, \"continuousHeight\": 300}}, \"data\": {\"name\": \"data-0c43d3d4693e2c625c095c363474e051\"}, \"mark\": {\"type\": \"point\"}, \"encoding\": {\"color\": {\"field\": \"copy\", \"type\": \"nominal\"}, \"x\": {\"field\": \"size\", \"scale\": {\"type\": \"log\"}, \"title\": \"size (B)\", \"type\": \"quantitative\"}, \"y\": {\"field\": \"throughput\", \"scale\": {\"type\": \"log\"}, \"title\": \"msgs/sec\", \"type\": \"quantitative\"}}, \"title\": \"Throughput\", \"$schema\": \"https://vega.github.io/schema/vega-lite/v5.16.3.json\", \"datasets\": {\"data-0c43d3d4693e2c625c095c363474e051\": [{\"size\": 100, \"count\": 655360, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 1618700.5118922223, \"throughput\": 412531.6094729082}, {\"size\": 100, \"count\": 1310720, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 1636402.7520316425, \"throughput\": 415309.0207953903}, {\"size\": 100, \"count\": 262144, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 178122.46036972723, \"throughput\": 178051.29098538886}, {\"size\": 100, \"count\": 524288, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 181795.72699602958, \"throughput\": 181758.94298743442}, {\"size\": 215, \"count\": 524288, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 1599087.4153351092, \"throughput\": 408223.4691862092}, {\"size\": 215, \"count\": 1048576, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 1654483.4168393286, \"throughput\": 411851.1004101201}, {\"size\": 215, \"count\": 2097152, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 1560885.0020993599, \"throughput\": 380871.6512509942}, {\"size\": 215, \"count\": 262144, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 181785.17838336073, \"throughput\": 181725.30452214953}, {\"size\": 215, \"count\": 524288, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 179191.15934866655, \"throughput\": 179161.63697074397}, {\"size\": 464, \"count\": 327680, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 1448207.5088667995, \"throughput\": 379371.1487201747}, {\"size\": 464, \"count\": 655360, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 1579125.655165121, \"throughput\": 384253.47032506694}, {\"size\": 464, \"count\": 1310720, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 1562701.3313131528, \"throughput\": 384466.2172890232}, {\"size\": 464, \"count\": 262144, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 181365.92040415507, \"throughput\": 181289.75947361803}, {\"size\": 464, \"count\": 524288, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 179800.73159221103, \"throughput\": 179778.7017683385}, {\"size\": 1000, \"count\": 327680, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 1428335.4358941857, \"throughput\": 330164.00373951375}, {\"size\": 1000, \"count\": 655360, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 1507676.8035916092, \"throughput\": 307840.3458703539}, {\"size\": 1000, \"count\": 1310720, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 1359454.9791983308, \"throughput\": 257823.9301234016}, {\"size\": 1000, \"count\": 262144, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 172757.13689609861, \"throughput\": 172688.07411353884}, {\"size\": 1000, \"count\": 524288, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 181337.96276033064, \"throughput\": 181304.57403175774}, {\"size\": 2154, \"count\": 262144, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 678967.1268527976, \"throughput\": 233115.43952882718}, {\"size\": 2154, \"count\": 524288, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 711456.8250122858, \"throughput\": 232133.97155806594}, {\"size\": 2154, \"count\": 1048576, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 988131.8036590181, \"throughput\": 259662.81578780443}, {\"size\": 2154, \"count\": 262144, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 199419.770959639, \"throughput\": 196784.6402886265}, {\"size\": 2154, \"count\": 524288, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 203852.68684935448, \"throughput\": 203442.86286040884}, {\"size\": 2154, \"count\": 1048576, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 205328.8669662604, \"throughput\": 205179.1179920869}, {\"size\": 4641, \"count\": 131072, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 529405.1603990332, \"throughput\": 115303.36348625875}, {\"size\": 4641, \"count\": 262144, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 583285.1122878667, \"throughput\": 117434.51844281104}, {\"size\": 4641, \"count\": 524288, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 395802.8775625893, \"throughput\": 110084.274052579}, {\"size\": 4641, \"count\": 81920, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 213297.1772399376, \"throughput\": 87008.06733234416}, {\"size\": 4641, \"count\": 163840, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 211868.26866939594, \"throughput\": 88778.47162810937}, {\"size\": 4641, \"count\": 327680, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 212249.61622593182, \"throughput\": 89495.02159853339}, {\"size\": 10000, \"count\": 81920, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 257995.0564216572, \"throughput\": 54825.506765252365}, {\"size\": 10000, \"count\": 163840, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 262129.14755280566, \"throughput\": 57333.53026383853}, {\"size\": 10000, \"count\": 65536, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 191021.31386416883, \"throughput\": 41911.956992923675}, {\"size\": 10000, \"count\": 131072, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 202913.54479633324, \"throughput\": 42457.1939283853}, {\"size\": 21544, \"count\": 32768, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 177443.97477792783, \"throughput\": 29372.401261157018}, {\"size\": 21544, \"count\": 65536, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 177576.88862374763, \"throughput\": 29066.131658719238}, {\"size\": 21544, \"count\": 131072, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 138727.89901399627, \"throughput\": 29374.21970428845}, {\"size\": 21544, \"count\": 32768, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 201025.35445775493, \"throughput\": 23078.87106639798}, {\"size\": 21544, \"count\": 65536, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 204718.24960548693, \"throughput\": 23491.03111833299}, {\"size\": 21544, \"count\": 131072, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 219725.5265540439, \"throughput\": 22828.4743396303}, {\"size\": 46415, \"count\": 16384, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 73601.14684100346, \"throughput\": 17508.546019734247}, {\"size\": 46415, \"count\": 32768, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 73170.2424126909, \"throughput\": 16716.53712493127}, {\"size\": 46415, \"count\": 65536, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 73457.89485233443, \"throughput\": 17210.069761187064}, {\"size\": 46415, \"count\": 16384, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 163516.1366693833, \"throughput\": 13875.610707367226}, {\"size\": 46415, \"count\": 32768, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 202160.1635241577, \"throughput\": 13476.278044952267}, {\"size\": 46415, \"count\": 65536, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 185767.60181616654, \"throughput\": 13725.569449783094}, {\"size\": 100000, \"count\": 16384, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 37515.162855347306, \"throughput\": 10550.889027478865}, {\"size\": 100000, \"count\": 32768, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 32214.09813650009, \"throughput\": 9997.366700079323}, {\"size\": 100000, \"count\": 8192, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 114647.10056401056, \"throughput\": 8462.575771521568}, {\"size\": 100000, \"count\": 16384, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 154631.9751441088, \"throughput\": 8647.257957404554}, {\"size\": 100000, \"count\": 32768, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 208749.72883021602, \"throughput\": 8858.596911463485}, {\"size\": 215443, \"count\": 8192, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 16738.023833102958, \"throughput\": 9313.586185548565}, {\"size\": 215443, \"count\": 16384, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 19483.36277076363, \"throughput\": 9184.819672681806}, {\"size\": 215443, \"count\": 32768, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 18003.597055837257, \"throughput\": 6645.669331530065}, {\"size\": 215443, \"count\": 8192, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 82190.20193261669, \"throughput\": 7339.747514674214}, {\"size\": 215443, \"count\": 16384, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 165385.59119625587, \"throughput\": 7915.104238343077}, {\"size\": 215443, \"count\": 32768, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 185066.18663312757, \"throughput\": 8306.44331020505}, {\"size\": 464158, \"count\": 4096, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 7096.212538966304, \"throughput\": 4397.668936559528}, {\"size\": 464158, \"count\": 8192, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 8671.972978335563, \"throughput\": 4665.796338452306}, {\"size\": 464158, \"count\": 16384, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 8530.948098017829, \"throughput\": 3915.9767249391775}, {\"size\": 464158, \"count\": 4096, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 137095.60353203723, \"throughput\": 3529.974957731902}, {\"size\": 464158, \"count\": 8192, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 181307.79490280838, \"throughput\": 4271.370480037751}, {\"size\": 464158, \"count\": 16384, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 202969.26291651686, \"throughput\": 4046.187970636908}, {\"size\": 1000000, \"count\": 2048, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 3937.3478823282953, \"throughput\": 2538.5598691356004}, {\"size\": 1000000, \"count\": 4096, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 4160.694129576773, \"throughput\": 2243.1145033279076}, {\"size\": 1000000, \"count\": 8192, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 3813.57789607267, \"throughput\": 1803.3720943516792}, {\"size\": 1000000, \"count\": 2048, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 150076.8044424124, \"throughput\": 2265.7232330830307}, {\"size\": 1000000, \"count\": 4096, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 185898.86321212963, \"throughput\": 2437.5735146596458}, {\"size\": 1000000, \"count\": 8192, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 189853.6789445999, \"throughput\": 2280.3150747859627}, {\"size\": 14677, \"count\": 65536, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 140224.00279488915, \"throughput\": 38868.83514555154}, {\"size\": 14677, \"count\": 131072, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 166314.92639069958, \"throughput\": 42575.9342877824}, {\"size\": 14677, \"count\": 32768, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 194429.81368534354, \"throughput\": 32239.65038720164}, {\"size\": 14677, \"count\": 65536, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 195196.68366741916, \"throughput\": 32029.404792125464}, {\"size\": 14677, \"count\": 131072, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 198742.20529483526, \"throughput\": 31894.0685621863}, {\"size\": 31622, \"count\": 32768, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 110073.81640556351, \"throughput\": 21580.20908476313}, {\"size\": 31622, \"count\": 65536, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 128626.20815280222, \"throughput\": 21754.46467333933}, {\"size\": 31622, \"count\": 16384, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 187310.40166528345, \"throughput\": 16970.0977069629}, {\"size\": 31622, \"count\": 32768, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 168925.76262788655, \"throughput\": 16983.235320177104}, {\"size\": 31622, \"count\": 65536, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 205731.73136614377, \"throughput\": 16296.06108483408}, {\"size\": 68129, \"count\": 16384, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 43131.002345727065, \"throughput\": 9672.858448504083}, {\"size\": 68129, \"count\": 32768, \"copy\": true, \"poll\": false, \"transport\": \"ipc\", \"sends\": 49613.79570942057, \"throughput\": 11079.798170520911}, {\"size\": 68129, \"count\": 8192, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 118809.40905549402, \"throughput\": 8850.894532154185}, {\"size\": 68129, \"count\": 16384, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 139138.36883617393, \"throughput\": 8576.169643536781}, {\"size\": 68129, \"count\": 32768, \"copy\": false, \"poll\": false, \"transport\": \"ipc\", \"sends\": 181967.92491015125, \"throughput\": 8382.442698105926}]}}, {\"mode\": \"vega-lite\"});\n",
"</script>"
],
"text/plain": [
"alt.Chart(...)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chart = crossover(thr, \"throughput\")\n",
"chart.title = \"Throughput\"\n",
"chart"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Compare the maximum throughput for small messages:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"zero-copy max msgs/sec: ~2.1e+05\n",
" copy max msgs/sec: ~4.2e+05\n"
]
}
],
"source": [
"zero_copy_max = thr.where(~thr[\"copy\"]).throughput.max()\n",
"copy_max = thr.where(thr[\"copy\"]).throughput.max()\n",
"print(f\"zero-copy max msgs/sec: ~{zero_copy_max:.1e}\")\n",
"print(f\" copy max msgs/sec: ~{copy_max:.1e}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So that's a ~5x penalty when sending 100B messages.\n",
"It's still 40k msgs/sec, which isn't catastrophic,\n",
"but if you want to send small messages as fast as possible,\n",
"you can get closer to 250-500k msgs/sec if you skip the zero-copy logic.\n",
"\n",
"We can see the relative gains of zero-copy by plotting zero-copy performance\n",
"normalized to message-copying performance"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'pd' is not defined",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[6], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m chart \u001b[38;5;241m=\u001b[39m \u001b[43mrelative\u001b[49m\u001b[43m(\u001b[49m\u001b[43mthr\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mthroughput\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[1;32m 2\u001b[0m chart\u001b[38;5;241m.\u001b[39mtitle \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mZero-copy Throughput (relative)\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 3\u001b[0m chart\n",
"Cell \u001b[0;32mIn[1], line 24\u001b[0m, in \u001b[0;36mrelative\u001b[0;34m(data, column, yscale)\u001b[0m\n\u001b[1;32m 20\u001b[0m no_copy \u001b[38;5;241m=\u001b[39m data[\u001b[38;5;241m~\u001b[39mdata[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcopy\u001b[39m\u001b[38;5;124m\"\u001b[39m]]\n\u001b[1;32m 21\u001b[0m reference \u001b[38;5;241m=\u001b[39m copy_mean[no_copy[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msize\u001b[39m\u001b[38;5;124m\"\u001b[39m]]\n\u001b[1;32m 22\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m (\n\u001b[1;32m 23\u001b[0m alt\u001b[38;5;241m.\u001b[39mChart(\n\u001b[0;32m---> 24\u001b[0m \u001b[43mpd\u001b[49m\u001b[38;5;241m.\u001b[39mDataFrame(\n\u001b[1;32m 25\u001b[0m {\n\u001b[1;32m 26\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msize\u001b[39m\u001b[38;5;124m\"\u001b[39m: no_copy[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msize\u001b[39m\u001b[38;5;124m\"\u001b[39m],\n\u001b[1;32m 27\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mno-copy speedup\u001b[39m\u001b[38;5;124m\"\u001b[39m: no_copy[column] \u001b[38;5;241m/\u001b[39m reference\u001b[38;5;241m.\u001b[39marray,\n\u001b[1;32m 28\u001b[0m }\n\u001b[1;32m 29\u001b[0m )\n\u001b[1;32m 30\u001b[0m )\n\u001b[1;32m 31\u001b[0m \u001b[38;5;241m.\u001b[39mmark_point()\n\u001b[1;32m 32\u001b[0m \u001b[38;5;241m.\u001b[39mencode(\n\u001b[1;32m 33\u001b[0m x\u001b[38;5;241m=\u001b[39malt\u001b[38;5;241m.\u001b[39mX(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msize\u001b[39m\u001b[38;5;124m\"\u001b[39m, title\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msize (B)\u001b[39m\u001b[38;5;124m\"\u001b[39m)\u001b[38;5;241m.\u001b[39mscale(\u001b[38;5;28mtype\u001b[39m\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mlog\u001b[39m\u001b[38;5;124m\"\u001b[39m),\n\u001b[1;32m 34\u001b[0m y\u001b[38;5;241m=\u001b[39malt\u001b[38;5;241m.\u001b[39mY(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mno-copy speedup\u001b[39m\u001b[38;5;124m\"\u001b[39m, title\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m\"\u001b[39m)\u001b[38;5;241m.\u001b[39mscale(\u001b[38;5;28mtype\u001b[39m\u001b[38;5;241m=\u001b[39myscale),\n\u001b[1;32m 35\u001b[0m )\n\u001b[1;32m 36\u001b[0m )\n",
"\u001b[0;31mNameError\u001b[0m: name 'pd' is not defined"
]
}
],
"source": [
"chart = relative(thr, \"throughput\")\n",
"chart.title = \"Zero-copy Throughput (relative)\"\n",
"chart"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So that's ~5x penalty for using zero-copy on 100B messages\n",
"and a ~2x win for using zero-copy in ~500kB messages.\n",
"THe crossover where the cost balances the benefit is in the vicinity of ~64kB.\n",
"\n",
"This is why pyzmq 17 introduces the `zmq.COPY_THRESHOLD` behavior,\n",
"which sents a bound where `copy=False` can always be used,\n",
"and the zero-copy machinery will only be triggered for frames that are larger than this threshold.\n",
"The default for zmq.COPY_THRESHOLD in pyzmq-17.0 is 64kB,\n",
"based on these experiments."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Send-only throughput"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So far, we've only been measuring the time it takes to actually deliver all of those messages (total application throughput).\n",
"\n",
"One of the big wins for zero-copy in pyzmq is that the the local `send` action is much less expensive for large messages because there is no `memcpy` in the handoff to zmq.\n",
"Plotting only the time it takes to *send* messages shows a much bigger win,\n",
"but similar crossover point."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"chart = crossover(thr, \"sends\")\n",
"chart.title = \"Messages sent/sec\"\n",
"chart"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Scaled plot, showing ratio of zero-copy to copy throughput performance:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"chart = relative(thr, \"sends\", yscale=\"log\")\n",
"chart.title = \"Zero-copy sends/sec (relative speedup)\"\n",
"chart"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `socket.send` calls for ~1MB messages is ~20x faster with zero-copy than copy,\n",
"but it's also ~10x *slower* for very small messages.\n",
"\n",
"Taking that into perspective, the penalty for zero-copy is ~10 µs per send:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"copy_small = 1e6 / thr[thr[\"copy\"] * (thr[\"size\"] == thr[\"size\"].min())][\"sends\"].mean()\n",
"nocopy = 1e6 / thr[~thr[\"copy\"]][\"sends\"]\n",
"penalty = nocopy - copy_small\n",
"print(f\"Small copying send : {copy_small:.2f}µs\")\n",
"print(f\"Small zero-copy send: {nocopy.mean():.2f}µs ± {nocopy.std():.2f}µs\")\n",
"print(f\"Penalty : [{penalty.min():.2f}µs - {penalty.max():.2f}µs]\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"which is a pretty big deal for small sends that only take 2µs, but nothing for 1MB sends, where the memcpy can take almost a millisecond:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"copy_big = 1e6 / thr[thr[\"copy\"] * (thr[\"size\"] == thr[\"size\"].max())][\"sends\"].mean()\n",
"print(f\"Big copying send ({thr['size'].max() / 1e6:.0f} MB): {copy_big:.2f}µs\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Latency\n",
"\n",
"Latency tests measure REQ-REP request/reply cycles, waiting for a reply before sending the next request.\n",
"This more directly measures the cost of sending and receiving a single message,\n",
"removing any instance of queuing up multiple sends in the background.\n",
"\n",
"This differs from the throughput test, where many messages are in flight at once.\n",
"This is significant because much of the performance cost of zero-copy is in\n",
"contention between the garbage collection thread and the main thread.\n",
"If garbage collection events fire when the main thread is idle waiting for a message,\n",
"this has ~no extra cost."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open(\"lat.pickle\", \"rb\") as f:\n",
" lat = pickle.load(f)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"chart = crossover(lat, \"latency\", ylabel=\"µs\")\n",
"chart.title = \"Latency (µs)\"\n",
"chart"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"chart = relative(lat, \"latency\")\n",
"chart.title = \"Relative increase in latency zero-copy / copy\"\n",
"chart"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For the latency test, we see that there is much lower overhead to the zero-copy machinery when there are few messages in flight.\n",
"This is expected, because much of the performance cost comes from thread contention when the gc thread is working hard to keep up with the freeing of messages that zmq is done with.\n",
"\n",
"The result is a much lower penalty for zero-copy of small messages."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"state": {},
"version_major": 2,
"version_minor": 0
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}