Benchmarks for design

From CatchChallenger wiki

The Qt benchmarks were run with gcc 4.5, Qt 4.8, Qt 5.0 alpha, and this benchmark application: https://github.com/alphaonex86/QtSignalsSlotsBenchmark

Connection system

Connections per second

Greater is better

Threads                                          1        8        64       64       64
Previous connections                             10000    10000    10000    1000     100
Qt4 with 2 slots + 2 signals                     526316   526316   526316   500000   526316
Qt4 with 1000 slots + 1000 signals               31645    31645    31545    31746    31446
Qt5 (old syntax)                                 476190   500000   500000   500000   500000
Qt5 (new syntax) with 2 slots + 2 signals        909091   909091   1000000  1000000  909091
Qt5 (new syntax) with 1000 slots + 1000 signals  500000   500000   500000   500000   500000

Time to disconnect

With arguments

Lower is better; times are in seconds. Tested call: disconnect(sender,SIGNAL(),receiver,SLOT());

Threads               1        8        64       64       64       64       64
Previous connections  10000    10000    10000    100000   1000000  1000     100
Qt4                   1.35     1.375    1.369    11.6     103      0.888    0.817
Qt5 (old syntax)      1.469    1.495    1.51     12.5     117      0.993    0.933
Qt5 (new syntax)      1.653    1.573    1.568    19.5     212      0.993    0.93

In this case, with CatchChallenger and just 1000 clients (each client needs about 1000 inter-thread connections, so about 1000000 connections in total), the server becomes unusable: with Qt5 (old syntax) it takes 117 s to disconnect a client, and the server may freeze during that time.

Without arguments

Lower is better; times are in seconds. Tested call: this->disconnect();

Scenario                                   Qt5 (new syntax)
Where 10000000 connections already set up  0.205
Where 30000000 connections already set up  0.583
10000000 connections for other objects     0

With Qt5, this does not depend on the number of signals/slots defined in the class.

Messages

The signals/slots are connected with Qt::QueuedConnection, as they would be for cross-thread usage.

Messages sent

Greater is better

Threads               1        8        64       64       64
Previous connections  10000    10000    10000    1000     100
Qt4                   169870   171612   172446   175223   174128
Qt5                   275285   279835   277826   277117   274924

Qt5 delivers about 60% more throughput here. In either version of Qt, it makes no difference whether the class declares 2 or 1000 signals/slots.

I count 3 signals to answer one request (QTcpSocket -> ClientRead, ClientReadSocket -> Thread to parse, Thread to parse -> ClientWriteSocket). In the worst case that gives 169870/3 ~= 56000 requests/s. With 10 000 clients connected, that means 5.6 requests/s per player. A footstep every 200ms (the worst case) consumes 5 requests/s, leaving 0.6 requests/s per player. And this is the worst case: a normal PC, a large number of players, and the slowest version of Qt.
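The budget estimate above can be written out as a small calculation. This is only a sketch in standard C++: the names SignalBudget and computeBudget are hypothetical and do not come from the CatchChallenger sources, and the inputs are simply the figures quoted in the text.

```cpp
#include <cassert>

// Hypothetical helper types, introduced only for this illustration.
struct SignalBudget {
    double requestsPerSecond;   // requests/s the server can answer in total
    double perPlayerPerSecond;  // share of each connected player
    double sparePerPlayer;      // what is left once footsteps are paid for
};

// signalsPerSecond:   measured queued signals per second (from the table)
// signalsPerRequest:  signals needed to answer one request (3 in the text)
// players:            number of connected clients
// footstepIntervalMs: time between two footsteps of one player
SignalBudget computeBudget(double signalsPerSecond, int signalsPerRequest,
                           int players, double footstepIntervalMs)
{
    assert(signalsPerRequest > 0 && players > 0 && footstepIntervalMs > 0.0);
    SignalBudget budget{};
    budget.requestsPerSecond = signalsPerSecond / signalsPerRequest;
    budget.perPlayerPerSecond = budget.requestsPerSecond / players;
    const double footstepsPerSecond = 1000.0 / footstepIntervalMs;
    budget.sparePerPlayer = budget.perPlayerPerSecond - footstepsPerSecond;
    return budget;
}
```

Calling computeBudget(169870.0, 3, 10000, 200.0) reproduces the reasoning with unrounded values: roughly 56600 requests/s in total, about 5.66 per player, and about 0.66 left after footsteps. The text rounds these down to 56000, 5.6 and 0.6.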

CPU usage

Values are in ms; lower is better

Threads               1        8        64       64       64
Previous connections  10000    10000    10000    1000     100
Qt4                   1050     1030     910      810      970
Qt5                   910      830      810      870      880

System CPU usage

Values are in ms; lower is better

Threads               1        8        64       64       64
Previous connections  10000    10000    10000    1000     100
Qt4                   280      250      300      350      320
Qt5                   200      260      270      230      210

Qt5 seems to perform better, but given the fluctuation in the benchmark results, it is better not to rely on this.

Container

To store the player pointers, with full lookup

Store 65535 pointers.

  • Insert the current player into the list of 65534.
  • Iterate over the 65535 players to send the broadcast message.
  • Remove the current player from the list of 65535.
       Insert  List  Remove
QList  0       1     1
QSet   0       6     6
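The three measured steps can be sketched with standard C++ containers. This is an analogue, not the benchmark itself: std::vector stands in for the array-backed QList and std::unordered_set for the hash-based QSet, and Player, broadcastWithList and broadcastWithSet are hypothetical names. It shows the operations only and does not reproduce the timings in the table.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

struct Player { int id; };  // hypothetical stand-in for the real player class

// Insert, full iteration, remove on an array-backed list.
// Returns how many players were visited during the broadcast step.
std::size_t broadcastWithList(std::vector<Player*>& players, Player* current)
{
    assert(current != nullptr);
    players.push_back(current);                  // insert the current player
    std::size_t visited = 0;
    for (Player* player : players) {             // list all players
        if (player != nullptr)
            ++visited;                           // stands in for "send message"
    }
    // Remove the current player: linear search, then erase.
    players.erase(std::remove(players.begin(), players.end(), current),
                  players.end());
    return visited;
}

// The same three steps on a hash-based set.
std::size_t broadcastWithSet(std::unordered_set<Player*>& players,
                             Player* current)
{
    assert(current != nullptr);
    players.insert(current);                     // insert the current player
    std::size_t visited = 0;
    for (Player* player : players) {             // iteration walks the hash nodes
        if (player != nullptr)
            ++visited;
    }
    players.erase(current);                      // remove by hash lookup
    return visited;
}
```

Since the broadcast touches every player, the iteration step is the one that matters most here, which is also the step where the table shows the hash-based container falling behind.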

String

Lower is better (https://github.com/alphaonex86/CatchChallenger/tree/master/tools/benchmarkstring). Measured with gcc 4.7 and a debug build of Qt 5.2 with C++11. Formats:

  • QStringLiteral, where QStringLiteral("string")
  • QLatin1String, where QLatin1String("string")
  • QString, where QString("string")
  • QLatin1Literal, where QLatin1Literal("string")
  • Char*, where "string"
  • Prepared, where an existing QString variable is used directly

Tests:

  • Condition with != and ==: test inside a condition
  • QDomElement::hasAttribute(): test as a function argument
  • concat by +: concatenation
  • replace format to QString: var.replace(FORMAT ABOVE,varQString);
  • replace format to format: var.replace(FORMAT ABOVE,FORMAT ABOVE);
  • search replace string to format: varQString.replace(regex,FORMAT ABOVE);
  • search replace format to string: FORMAT ABOVE.replace(regex,varQString);
Test                             QStringLiteral  QLatin1String  QString  QLatin1Literal  Char*    Prepared
Condition with != and ==         3936ms          204ms          4586ms   231ms           4630ms   92ms
QDomElement::hasAttribute()      1947ms          1507ms         2280ms   1505ms          2339ms   31ms
concat by +                      4161ms          3771ms         4589ms   3775ms          4275ms   2492ms
Replace format to QString        2957ms          1559ms         3288ms   1555ms          3364ms   1280ms
Replace format to format         5140ms          2142ms         5741ms   2130ms          5811ms   237ms
Search replace string to format  11986ms         11194ms        12310ms  11207ms         12291ms  9472ms
Search replace format to string  11877ms         NA             12145ms  NA              NA       9313ms

Using prepared strings, the datapack loading time (mostly XML parsing time + file access time) dropped from 1560ms to 742ms (https://github.com/alphaonex86/CatchChallenger/commit/4eeece9d5169d125fcfdbc2126629cdd2a75fe1c).
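The idea behind the "Prepared" row can be illustrated without Qt. The sketch below is an analogue using std::string rather than QString, so it does not reproduce the measured numbers; hasAttribute, countWithLiteral and countWithPrepared are hypothetical names. It only shows the difference between converting a literal on every call and building the string object once.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Element = std::vector<std::string>;  // attribute names of one element

// Hypothetical lookup taking a string object by reference, in the spirit of
// an API that expects a string-class argument rather than a raw literal.
bool hasAttribute(const Element& attributes, const std::string& name)
{
    assert(!name.empty());  // attribute names are assumed to be non-empty
    for (const std::string& attribute : attributes) {
        if (attribute == name)
            return true;
    }
    return false;
}

// "Char*" style: the literal is converted to a temporary std::string on
// every call, so the conversion cost is paid once per element.
std::size_t countWithLiteral(const std::vector<Element>& elements)
{
    std::size_t hits = 0;
    for (const Element& element : elements) {
        if (hasAttribute(element, "type"))
            ++hits;
    }
    return hits;
}

// "Prepared" style: the string object is built once and reused.
std::size_t countWithPrepared(const std::vector<Element>& elements)
{
    const std::string typeAttribute("type");  // prepared outside the loop
    std::size_t hits = 0;
    for (const Element& element : elements) {
        if (hasAttribute(element, typeAttribute))
            ++hits;
    }
    return hits;
}
```

Both functions return the same count; the prepared variant just avoids repeating the conversion inside the loop, which is where a parser that checks the same attribute names on many elements would spend its time.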

Send to all objects

  • Case 1: 600x connect(timer to object), to send to all objects
  • Case 2: connect(timer to object with QList); inside the object with the QList, do:
int index=0;
while(index<objectlist.size())
{
    objectlist.at(index)->call();
    index++;
}
        % of CPU
Case 1  21%
Case 2  2.5%

That means an 8.4x performance improvement for only 600 objects... imagine with 65535 objects (that is, players), as I plan...
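Case 2 can be sketched in standard C++ as a single object that owns the list and forwards each tick to every receiver. Receiver, Broadcaster and improvementFactor are hypothetical names used only for illustration, and the Qt timer connection itself is left out: onTimeout() stands in for the one slot that would be connected to the timer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical receiver; call() stands in for the work done on each tick.
class Receiver {
public:
    void call() { ++calls_; }
    int calls() const { return calls_; }
private:
    int calls_ = 0;
};

// Case 2: one object holds the list and is the only one wired to the timer.
class Broadcaster {
public:
    void add(Receiver* receiver)
    {
        assert(receiver != nullptr);
        objectlist_.push_back(receiver);
    }

    // Would be the single slot connected to the timer.
    void onTimeout()
    {
        for (std::size_t index = 0; index < objectlist_.size(); ++index)
            objectlist_[index]->call();  // plain function call per object
    }

private:
    std::vector<Receiver*> objectlist_;
};

// Ratio between the CPU usage of the two cases.
double improvementFactor(double cpuCase1Percent, double cpuCase2Percent)
{
    assert(cpuCase2Percent > 0.0);
    return cpuCase1Percent / cpuCase2Percent;
}
```

With the measured values, improvementFactor(21.0, 2.5) gives the 8.4x quoted above. The design trade-off is that one timer connection plus direct calls replaces 600 separate connections, at the cost of keeping the list in sync when objects are added or removed.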
