
Testing request

Thursday, 1 August 2013, 10:04

Hi folks,

I have committed initial support for reading device (sensor, fan, etc.) names to my devel branch of aerotools-ng and need some testing feedback to make sure that the way the device names are retrieved works properly for other Linux users. The method I was forced to use isn't ideal and can fail under some rare circumstances. I would like to confirm that such failures are not a real concern by getting test feedback from other systems of varying performance levels.

If you have the time and inclination, please lend a hand by cloning my devel branch and giving it a try.

Notes: As of commit 016ff1753b2d40f39d74dfc762196bde9ad98b12 sensor names are printed in default full (aerocli -a -o default) output mode only, so please try this mode and report back on your findings.

Example of expected results from my system:

Source code

----------- System -----------
Name          = 'aquaero 5'
Time (UTC)    = 2013-08-01 09:05:29
Time (local)  = 2013-08-01 01:05:29
Uptime        = 108d 11:35:28
Uptime total  = 1y 22d 20:11:57
Serial number = XXXXX-YYYYY
Firmware      = 1027
Bootloader    = 101
Hardware      = 5600
CPU Temp      = 41.28 °C (+0.00)
---- Temperature Sensors -----
Sensor  1 'Sensor 1t'   = 30.93 °C (+0.00)
Sensor  2 'Sensor 2'    = not connected
Sensor  3 'Sensor 3'    = not connected
Sensor  4 'Sensor 4'    = not connected
Sensor  5 'Sensor 5'    = not connected
Sensor  6 'Sensor 6'    = not connected
Sensor  7 'Sensor 7'    = not connected
Sensor  8 'Sensor 8'    = not connected
Sensor  9 'Unknown'     = not connected
Sensor 10 'Unknown'     = not connected
Sensor 11 'Unknown'     = not connected
Sensor 12 'Unknown'     = not connected
Sensor 13 'Unknown'     = not connected
Sensor 14 'Unknown'     = not connected
Sensor 15 'Unknown'     = not connected
Sensor 16 'Unknown'     = not connected
---- Virtual Sensors -----
Sensor  1 'Virtual sensor 1t'   = 32.86 °C (+0.00)
Sensor  2 'Virtual sensor 2t'   = not connected
Sensor  3 'Virtual sensor 3t'   = not connected
Sensor  4 'Virtual sensor 4t'   = not connected
---- Software Sensors -----
Sensor  1 'Software sensor 1t'  = 51.00 °C (+0.00)
Sensor  2 'Software sensor 2t'  = not connected
Sensor  3 'Software sensor 3t'  = not connected
Sensor  4 'Software sensor 4t'  = not connected
Sensor  5 'Software sensor 5t'  = not connected
Sensor  6 'Software sensor 6t'  = not connected
Sensor  7 'Software sensor 7t'  = not connected
Sensor  8 'Software sensor 8t'  = not connected
---- Other Sensors -----
Sensor  1     = not connected
Sensor  2     = not connected
Sensor  3     = not connected
Sensor  4     = not connected
Sensor  5     = not connected
Sensor  6     = not connected
Sensor  7     = not connected
Sensor  8     = not connected
Sensor  9     = not connected
Sensor 10     = not connected
Sensor 11     = not connected
Sensor 12     = not connected
Sensor 13     = not connected
Sensor 14     = not connected
Sensor 15     = not connected
Sensor 16     = not connected
------------ Fans ------------
Fan  1 'Fan 1t':        3916rpm @  99% 11.87 V
'Fan amplifier 1t'       155 mA  34.80 °C
Fan  2 'Fan 2t':           0rpm @   0%  0.03 V
'Fan amplifier 2t'         0 mA  34.80 °C
Fan  3 'Fan 3t':           0rpm @   0%  0.02 V
'Fan amplifier 3t'         0 mA  34.62 °C
Fan  4 'Fan 4t':           0rpm @  99% 11.80 V
'Fan amplifier 4t'         0 mA  34.62 °C
Fan  5 'Fan 5t':           0rpm @  99% 12.21 V
'Fan amplifier 5t'         0 mA  51.32 °C
Fan  6 'Fan 6t':           0rpm @  99% 12.21 V
'Fan amplifier 6'          0 mA  52.42 °C
Fan  7 'Fan 7t':           0rpm @  99% 12.19 V
'Fan amplifier 7'          0 mA  50.22 °C
Fan  8 'Fan 8':         not connected
Fan  9 'Fan 9':         not connected
Fan 10 'Fan 10':        not connected
Fan 11 'Fan 11':        not connected
Fan 12 'Fan 12':        not connected
-------- Flow Sensors --------
Flow  1 'Flow 1t':       0 l/h
Flow  2 'Flow 2t':       0 l/h
Flow  3 'Flow 3t':       0 l/h
Flow  4 'Flow 4t':       0 l/h
Flow  5 'Flow 5t':       0 l/h
Flow  6 'Flow 6':        0 l/h
Flow  7 'Flow 7':        0 l/h
Flow  8 'Flow 8':        0 l/h
Flow  9 'Flow 9':        0 l/h
Flow 10 'Flow 10':       0 l/h
Flow 11 'Flow 11':       0 l/h
Flow 12 'Flow 12':       0 l/h
Flow 13 'Flow 13':       0 l/h
Flow 14 'Flow 14':       0 l/h
------- Liquid Levels --------
Level  1 'Fill level 1':         0%
Level  2 'Fill level 2':         0%
Level  3 'Fill level 3':         0%
Level  4 'Fill level 4':         0%
-------- Aquastreams ---------
Aquastream  1:  not connected
Aquastream  2:  not connected


Thanks in advance for your help! :D

This post has been edited 1 time, last by »JinTu« (1 August 2013, 10:17)

Thursday, 1 August 2013, 23:30

Works fine, here are the outputs. Names are correct.
»Raptor 2101« attached the following file:
  • output.txt (11.54 kB - downloaded 426 times - last: 12 April 2024, 12:28)

Friday, 2 August 2013, 01:53

Thanks Raptor 2101!

Since you have Aquastreams: can you tell me if they are the original Aquastream or Aquastream XT? The Ae5 makes a distinction between the two and I want to make sure I am using the appropriate values/names.

Also, if you wouldn't mind, I would appreciate it if you could run the following script and let me know what it says at the end of execution:

Source code

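#!/bin/bash
# Runs aerocli repeatedly and reports how many log lines mention a failure.

ITERATIONS=1000
AEROCLI=../aerotools-ng/bin/aerocli   # adjust to your system
LOG=test-output.txt

date1=$(date +"%s")
rm -f "$LOG"   # -f: no error if the log does not exist yet

for ((i = 1; i <= ITERATIONS; i++))
do
  echo "Iteration $i"
  echo "Iteration $i" >>"$LOG"
  "$AEROCLI" >>"$LOG" 2>&1
  sleep 5   # avoid thrashing the USB subsystem
done

date2=$(date +"%s")
diff=$((date2 - date1))

echo "Test completed in $((diff / 60)) minutes and $((diff % 60)) seconds"

# Count log lines containing "failed" and express them as a percentage of runs
FAILURES=$(grep -c "failed" "$LOG")
PERCENT_FAILED=$(echo "scale=4; ($FAILURES / $ITERATIONS) * 100" | bc -q 2>/dev/null)
echo "Encountered $FAILURES failures during test ($PERCENT_FAILED%)"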


You may need to tweak the path to aerocli as appropriate for your system. The script runs aerocli 1000 times and counts failures. Without the sleep it will likely take between 5 and 10 minutes to complete, depending on the number of failures and your system performance; the 5-second sleep per iteration adds roughly 83 minutes (1000 × 5 s) on top of that.

The output should look something like this:

Source code

...
Iteration 997
Iteration 998
Iteration 999
Iteration 1000
Test completed in 6 minutes and 28 seconds
Encountered 2 failures during test (.2000%)

The part I am interested in is just the last two lines.

Edit: Added sleep to loop so as to not thrash the USB subsystem
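
If a test run was interrupted, or if bc is not installed (in which case the script above prints an empty percentage, because bc errors are discarded), the summary can also be recomputed from the log afterwards. A minimal sketch, assuming the log layout written by the script above (one "Iteration N" line per run, failures marked by the word "failed"); the function name summarize_log is hypothetical and not part of aerotools-ng:

```shell
#!/bin/bash
# Hypothetical helper: summarize an existing test log without rerunning the test.
# $1 = path to the log file, e.g. test-output.txt
summarize_log() {
  local log=$1
  local runs failures
  # The test script writes one "Iteration N" line per run.
  runs=$(grep -c '^Iteration ' "$log")
  # Same failure criterion as the test script: lines containing "failed".
  failures=$(grep -c 'failed' "$log")
  # awk does the floating-point division, so bc is not needed.
  awk -v f="$failures" -v n="$runs" 'BEGIN {
    if (n == 0) { print "No iterations found"; exit 1 }
    printf "Encountered %d failures in %d iterations (%.4f%%)\n", f, n, f / n * 100
  }'
}

# Example call (commented out so that sourcing this file has no side effects):
# summarize_log test-output.txt
```

Because it counts the "Iteration" lines actually present in the log, the percentage stays meaningful for a run that was stopped early.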

This post has been edited 3 times, last by »JinTu« (7 August 2013, 08:56)

Friday, 2 August 2013, 21:35

Test completed in 14 minutes and 56 seconds
Encountered 505 failures during test (50.5000%)


Test system: Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz (quad core), 32 GB RAM, 64-bit Ubuntu

I have two Aquastream XT Standard units attached.

Saturday, 3 August 2013, 00:22

Thanks for the test report and additional info!

Unfortunately with a failure rate like that, it looks like I have some more work to do... Which kernel version are you currently running? I have seen some additional oddities with the hiddev version that is part of kernel 3.2 and wonder if there are some additional avenues to explore in that area...

Saturday, 3 August 2013, 01:45

Kernel version: 3.2.0-52-generic

Ready for another round of testing

Wednesday, 7 August 2013, 00:57

Hi folks,

I am ready for another round of testing with the latest commit from my devel branch of aerotools-ng. I finally found a method that allows reading the 8x HID report 12s that contain the device names "the right way" (full details are in commit 4434bd86ed7aa3ec4fcb7e56605c5d2fb9470a41). With the latest changes I can no longer get the read operation to fail and hope others have the same experience.

Note: I have updated the test script in post 23 to be slightly less aggressive by inserting a 5 second delay at the end of each test iteration. This should alleviate the USB hang/crashes reported by some testers.

I have also updated all the references from Aquastream to Aquastream XT, as the XT uses a different set of device names than the one used in the previous commit.

Thanks in advance for your help!

This post has been edited 1 time, last by »JinTu« (7 August 2013, 08:56)

Wednesday, 7 August 2013, 23:09

With your original script I get the following:

Source code

Test completed in 2 minutes and 0 seconds
Encountered 0 failures during test (0%)

uname -r
3.8.0-28-generic


works very well ;)

Thursday, 8 August 2013, 02:40

Great!

Thursday, 8 August 2013, 14:09

I encountered some problems while running your new code on my production machine. When using the munin plugin I have to remove the "-d /dev/usb/hiddev0" argument because it leads to a "bad file descriptor" error. Using the CLI without this argument, everything works well...

Thursday, 8 August 2013, 16:57

I will look into it.

Thursday, 8 August 2013, 22:03

I have the output of three different systems (each with its own AE5) for you. Good things first: independent of the CPU (the slowest is an Intel Atom 510), the timings are nearly the same (around 2 minutes).

The results are here: test-output.txt.zip

Friday, 9 August 2013, 08:04

The "bad file descriptor" error seen when the munin plugin passes "-d /dev/usb/hiddev0" (reported above) is now fixed as of commit 4ac7f2c57128edadcf241f6602cc6078eadc7f1f.

Saturday, 10 August 2013, 00:25

OK, deployed the new version correctly...

With the old version I encountered a situation (long-running test, 3 requests every 5 minutes) where the AE5 could no longer be queried by aerocli (it simply hung and became a zombie). I had to do a soft reset via

Source code

echo 0 > /sys/bus/usb/devices/3-1/authorized
echo 1 > /sys/bus/usb/devices/3-1/authorized
to get access to the AE5 again...

Saturday, 10 August 2013, 04:22

I have seen something similar that occurs on both 2.6 and 3.2 kernels and added some code in the last commit that was intended to help detect when it occurs. Unfortunately it doesn't seem to work. What I see happening is that read() operations on the hiddev device just block indefinitely as though there was no data being sent on the bus. What I added in the last commit was an initial select() that should be able to detect if data is available for reading before starting to read, and timing out if it is not. Unfortunately the select() call succeeds even in this situation so there is more work to be done to figure it out. I did a bit of searching and couldn't find any references to similar problems being reported on other devices/systems, so we may need to blaze our own trail again....

Thanks for the info on /sys/bus/usb/devices/.../authorized. I had simply been unplugging/re-plugging the Ae5 to correct this, but detecting/automating the recovery would be best. It may be time to dust off my earlier experiments with udev to see if we can automate the reset when it occurs again.
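
Until the cause is understood, the hang could also be caught from outside aerocli. A rough sketch, assuming GNU timeout from coreutils is available and that the blocked aerocli process can still be terminated by a signal; the function names, the 30-second limit in the example call and the 1-second pause are illustrative assumptions, and the device path 3-1 is taken from the report above and will differ between systems:

```shell
#!/bin/bash
# Sketch only: these helpers are hypothetical and not part of aerotools-ng.

# Soft-reset a USB device by toggling its sysfs "authorized" attribute.
# $1 = path to the attribute, e.g. /sys/bus/usb/devices/3-1/authorized
usb_soft_reset() {
  local attr=$1
  echo 0 > "$attr" || return 1
  sleep 1   # assumed settle time before re-authorizing the device
  echo 1 > "$attr"
}

# Run a command under a time limit; if it hangs, soft-reset the device.
# $1 = time limit in seconds, $2 = authorized attribute, rest = command
run_with_recovery() {
  local limit=$1 attr=$2
  shift 2
  timeout "$limit" "$@"
  local status=$?
  if [ "$status" -eq 124 ]; then   # 124: timeout(1) had to stop the command
    echo "command timed out after ${limit}s, resetting USB device" >&2
    usb_soft_reset "$attr"
    return 124
  fi
  return "$status"
}

# Example call (commented out; writing to sysfs requires root):
# run_with_recovery 30 /sys/bus/usb/devices/3-1/authorized \
#   ../aerotools-ng/bin/aerocli -a -o default
```

Whether a signal reliably ends every variant of the hang has not been verified here; a process stuck in uninterruptible sleep would not be stopped by timeout.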

Saturday, 10 August 2013, 13:15

I think support from one of the unofficial Aqua Computer staff would be helpful ;)

Sunday, 11 August 2013, 22:53

OK, with your new version the "hang to zombie" happens quite often. Whereas I ran the old version for a week (through munin), the new version got two hangs this weekend...

Duration:

This post has been edited 1 time, last by »Raptor 2101« (11 August 2013, 23:05)

Monday, 12 August 2013, 09:44

Thanks for the heads-up. I am going to have a go at removing the read step (which is just a loop looking for new reports) and only use the ioctls instead since they seem to be consistent and reliable.

Tuesday, 13 August 2013, 02:11

I have prototyped the alternate approach in my sandbox (commit a104b34be9290dee691529ce6dee08a8ff87fcc8). I need to give it some more testing though since I did manage to hang my devel box a few times as I was tweaking things. I must admit, this is the most fragility I have seen with stuff running in user space on Linux for a long time and I have been a Linux hacker since kernel 0.99pl8...

Tuesday, 13 August 2013, 11:31


Did you freeze your whole machine or only the AE5 USB interface? (Just to decide whether I deploy your sandbox on my production machine... ;))
