7 | | '''Symptom''': No new images are showing up on disk or in the obsbot display, and the real time display does not update. The control system and readout timer, |
8 | | however, had no indication that data wasn't being written. |
9 | | |
10 | | '''Frequency:''' This appeared as a new problem starting on 5/6 Feb 2016. |
11 | | |
12 | | '''Fix:''' |
13 | | - In a mosaic3 xterm window type |
14 | | {{{ |
15 | | touch ~observer/exec/mosbot/quit |
16 | | }}} |
17 | | - Wait for the exposure to complete and the observing script to stop |
18 | | - Stop NOCS using the red "Stop MOSAIC" button on the MOSAIC 3 MENU gui |
19 | | - Once NOCS has shut down, find the blue xterm window which came up when NOCS was launched and type |
20 | | {{{ |
21 | | nocs nuke pana |
22 | | nocs nuke dhs |
23 | | nocs status all |
24 | | }}} |
25 | | - The output from nocs status all should show that all processes are STOPPED and only 3 pvm processes RUNNING |
26 | | - Restart NOCS using the blue "Start MOSAIC" button on the MOSAIC 3 MENU gui |
27 | | - Once the system is back up, take a ZERO image to make sure everything is working |
28 | | - When the image appears, in the IRAF window type |
29 | | {{{ |
30 | | mscstat <file name> |
31 | | }}} |
32 | | - If the results of the mscstat show the noise (in the Standard Deviation column) to be 4<rms<10, then ... |
33 | | - restart the observing script by following the instructions here |
34 | | |
35 | | == 1. Images appear on the real time display, but are not written immediately to the data directory == |
36 | | |
37 | | '''Symptom''': The real time display shows updated images, but files are very slow to appear in the data directory. The MOSSTAT and copilot displays are slow to update; copilot may eventually flag the image with a "readtime" failure. |
38 | | |
39 | | '''Frequency:''' Frequent as of early February 2016. |
40 | | |
41 | | '''Fix:''' |
42 | | - In the Data Handling System window in the mosaic3:1 VNC session |
43 | | {{{ |
44 | | - select the Shared Memory Cache tab |
45 | | - Click Process All |
46 | | - Click Update Status |
47 | | |
48 | | }}} |
49 | | - the image should then appear in the data directory. |
50 | | |
51 | | |
52 | | == 2. Detector timeouts == |
53 | | |
54 | | '''Symptom''': The count-down timer in the NMSL GUI turns red and keeps counting down into negative numbers (to -105 or so), or hangs during the exposure, and no new images are read out. |
55 | | |
56 | | [[Image(nocs_timeout.png, 400px)]] |
57 | | |
58 | | '''Frequency''': Several times per night in the early Feb 2016 run. No timeouts on Feb 5, one on Feb 6. |
59 | | |
60 | | '''Fix:''' |
61 | | - Stop the observing script using "touch quit" |
62 | | - Try reading out the last image |
63 | | {{{ |
64 | | ditscmd nohs nohs_endobs |
65 | | }}} |
66 | | - Reinitialize the NMSL |
67 | | {{{ |
68 | | Open a new xterm window on mayall-2 |
69 | | ssh into mosaic3 by typing: ssh observer@mosaic3 |
70 | | Type: nmslReset |
71 | | }}} |
72 | | - (The above procedure should be executable from a regular xterm that comes up with NOCS, but for some reason the .alias |
73 | | file is not being sourced properly. This still needs to be fixed.) |
74 | | - Take a zero image, make sure it reads out, and check the image by running "mscstat" in the IRAF terminal. |
75 | | |
76 | | - If the zero image times out, then check the ccp: |
77 | | {{{ |
78 | | nocs fullstatus ccp |
79 | | |
80 | | It should look something like this: |
81 | | "stuff here" |
82 | | |
83 | | }}} |
84 | | |
85 | | This will indicate if the problem is at the controller level. I.e., you should see something |
86 | | indicating the controllers are all ok, or one or more are giving errors. |
87 | | |
88 | | If fullstatus looks OK (not a controller problem), but the above ditscmd did not work, |
89 | | then just NUKE! |
90 | | |
91 | | {{{ |
92 | | Stop Mosaic |
| 7 | Symptom:: No new images are showing up on disk or in the obsbot display, and the real time display does not update. The control system and readout timer, however, had no indication that data wasn't being written. |
| 8 | Frequency:: This appeared as a new problem starting on 5/6 Feb 2016. |
| 9 | Fix:: |
| 10 | - In a mosaic3 xterm window type |
| 11 | {{{ |
| 12 | touch ~observer/exec/mosbot/quit |
| 13 | }}} |
| 14 | - Wait for the exposure to complete and the observing script to stop |
| 15 | - Stop NOCS using the red "Stop MOSAIC" button on the MOSAIC 3 MENU gui |
| 16 | - Once NOCS has shut down, find the blue xterm window which came up when NOCS was launched and type |
| 17 | {{{ |
95 | | Start Mosaic |
96 | | }}} |
97 | | |
98 | | The nuke commands will clear issues with stale PAN processes. |
99 | | |
100 | | Take a zero and see if it looks ok by running mscstat. If good, restart the bot and keep going! |
101 | | |
102 | | There is (should be) no need to Stop/Start Cameras. However, if it doesn't look ok, then you may have to do a full restart. |
103 | | |
104 | | {{{ |
105 | | Stop Mosaic |
106 | | Stop Cameras |
107 | | Start Cameras |
108 | | Start Mosaic |
109 | | }}} |
110 | | |
111 | | Take a zero image, make sure it reads out, and check the image by running "mscstat". If the zero image looks ok, you can restart the observing script. |
| 20 | nocs status all |
| 21 | }}} |
| 22 | - The output from nocs status all should show that all processes are STOPPED and only 3 pvm processes RUNNING |
| 23 | - Restart NOCS using the blue "Start MOSAIC" button on the MOSAIC 3 MENU gui |
| 24 | - Once the system is back up, take a ZERO image to make sure everything is working |
| 25 | - When the image appears, in the IRAF window type |
| 26 | {{{ |
| 27 | mscstat <file name> |
| 28 | }}} |
| 29 | - If the results of the mscstat show the noise (in the Standard Deviation column) to be 4<rms<10, then ... |
| 30 | - restart the observing script by following the instructions here |
| 31 | |
| 32 | == 1. Images appear on the real time display, but are not written immediately to the data directory == |
| 33 | |
| 34 | Symptom:: The real time display shows updated images, but files are very slow to appear in the data directory. The MOSSTAT and copilot displays are slow to update; copilot may eventually flag the image with a "readtime" failure. |
| 35 | Frequency:: Frequent as of early February 2016. |
| 36 | Fix:: |
| 37 | * In the Data Handling System window in the mosaic3:1 VNC session |
| 38 | - select the Shared Memory Cache tab |
| 39 | - Click Process All |
| 40 | - Click Update Status |
| 41 | * The image should then appear in the data directory. |
| 42 | |
| 43 | == 2. Detector timeouts == |
| 44 | |
| 45 | Symptom:: The count-down timer in the NMSL GUI turns red and keeps counting down into negative numbers (to -105 or so), or hangs during the exposure, and no new images are read out.[[BR]] |
| 46 | [[Image(nocs_timeout.png, 400px)]] |
| 47 | Frequency:: Several times per night in the early Feb 2016 run. No timeouts on Feb 5, one on Feb 6. |
| 48 | Fix:: |
| 49 | - Stop the observing script using "touch quit" |
| 50 | - Try reading out the last image |
| 51 | {{{ |
| 52 | ditscmd nohs nohs_endobs |
| 53 | }}} |
| 54 | - Reinitialize the NMSL |
| 55 | * Open a new xterm window on mayall-2 |
| 56 | * ssh into mosaic3 by typing: `ssh observer@mosaic3` |
| 57 | * Type: `nmslReset` |
| 58 | - The above procedure should be executable from a regular xterm that comes up with NOCS, but for some reason the .alias file is not being sourced properly. This still needs to be fixed. |
| 59 | - Take a zero image, make sure it reads out, and check the image by running "mscstat" in the IRAF terminal. |
| 60 | - If the zero image times out, then check the ccp: |
| 61 | {{{ |
| 62 | nocs fullstatus ccp |
| 63 | }}} |
| 64 | It should look something like this: |
| 65 | {{{ |
| 66 | "stuff here" |
| 67 | }}} |
| 68 | This will indicate if the problem is at the controller level. That is, you should see something indicating the controllers are all ok, or one or more are giving errors. |
| 69 | - If fullstatus looks OK (not a controller problem), but the above ditscmd did not work, then just NUKE! |
| 70 | * Stop Mosaic |
| 71 | {{{ |
| 72 | nocs nuke pana |
| 73 | nocs nuke dhs |
| 74 | }}} |
| 75 | * Start Mosaic. The nuke commands will clear issues with stale PAN processes. |
| 76 | - Take a zero and see if it looks ok by running mscstat. If good, restart the bot and keep going! |
| 77 | - There is (should be) no need to Stop/Start Cameras. However, if it doesn't look ok, then you may have to do a full restart. |
| 78 | * Stop Mosaic |
| 79 | * Stop Cameras |
| 80 | * Start Cameras |
| 81 | * Start Mosaic |
| 82 | - Take a zero image, make sure it reads out, and check the image by running "mscstat". If the zero image looks ok, you can restart the observing script. |
115 | | '''Symptom:''' One amp has extremely large noise: the image looks awful when displayed with "mscdisplay", and the "mscstat" command reports a noise > 1000 ADU/pix. |
116 | | |
117 | | '''Frequency:''' Several times in Dec 2015 run, never in early Feb 2016 run. |
118 | | |
119 | | '''Fix:''' |
120 | | - Stop the observing script with Ctrl-C |
121 | | - Reset the NOCS controller by typing the following in a NOCS xterm window: |
122 | | {{{ |
123 | | nocs reset ccp |
124 | | nocs init ccp |
125 | | }}} |
126 | | - Take two zero images. The first image is always bad. The second should be OK. |
127 | | Check it using "mscstat <filename>" once it has read out. All rms values should be between 4 and 7 ADU, except for amp 6, which will be 8-10 ADU. |
128 | | - Re-start the observing script |
| 86 | Symptom:: One amp has extremely large noise: the image looks awful when displayed with "mscdisplay", and the "mscstat" command reports a noise > 1000 ADU/pix. |
| 87 | Frequency:: Several times in Dec 2015 run, never in early Feb 2016 run. |
| 88 | Fix:: |
| 89 | - Stop the observing script with Ctrl-C |
| 90 | - Reset the NOCS controller by typing the following in a NOCS xterm window: |
| 91 | {{{ |
| 92 | nocs reset ccp |
| 93 | nocs init ccp |
| 94 | }}} |
| 95 | - Take two zero images. The first image is always bad. The second should be OK. Check it using "mscstat <filename>" once it has read out. All rms values should be between 4 and 7 ADU, except for amp 6, which will be 8-10 ADU. |
| 96 | - Re-start the observing script |
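
The noise check on a zero image could be scripted along the following lines. This is only a minimal sketch: the function name `out_of_range_amps` and the input format (a dict mapping amp number to the rms read off the Standard Deviation column of "mscstat") are assumptions made for illustration, and no attempt is made to parse the actual "mscstat" output. The acceptable ranges (4-7 ADU in general, 8-10 ADU for amp 6) are the ones quoted in the last fix above.

```python
# Acceptable rms ranges in ADU, as quoted in the troubleshooting text.
DEFAULT_RANGE = (4.0, 7.0)
AMP_RANGES = {6: (8.0, 10.0)}  # amp 6 is expected to be noisier


def out_of_range_amps(rms_by_amp):
    """Return the sorted amp numbers whose rms is outside the expected range.

    rms_by_amp is a hypothetical input: a dict of amp number -> rms (ADU),
    copied by hand from the mscstat Standard Deviation column.
    """
    bad = []
    for amp, rms in rms_by_amp.items():
        low, high = AMP_RANGES.get(amp, DEFAULT_RANGE)
        if not (low <= rms <= high):
            bad.append(amp)
    return sorted(bad)


if __name__ == "__main__":
    # Example values only; amp 3 mimics the "> 1000 ADU/pix" symptom.
    example = {1: 5.2, 3: 1200.0, 6: 9.1}
    print("Amps needing attention:", out_of_range_amps(example))
```

An empty list would suggest the second zero image is good and the observing script can be restarted; any listed amp would point back to the reset procedure.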