Changes between Version 65 and Version 66 of PublicPages/MayallZbandLegacy/NotesforObservers/Problems


Timestamp:
Jan 30, 2017 2:08:49 PM (8 years ago)
Author:
Benjamin Alan Weaver
Comment:

--

  • PublicPages/MayallZbandLegacy/NotesforObservers/Problems

    v65 v66  
    5 5 == 0. No images being written ==
    6 6
    7 '''Symptom''': No new images are showing up on disk or in the obsbot display, and the real time display does not update.  The control system and readout timer,
    8 however, give no indication that data are not being written.
    9 
    10 '''Frequency:''' This appeared as a new problem starting on 5/6 Feb 2016.
    11 
    12 '''Fix:'''
    13    - In a mosaic3 xterm window type
    14 {{{
    15       touch ~observer/exec/mosbot/quit
    16 }}}
    17    - Wait for the exposure to complete and the observing script to stop
    18    - Stop NOCS using the red "Stop MOSAIC" button on the MOSAIC 3 MENU gui
    19    - Once NOCS has shut down, find the blue xterm window which came up when NOCS was launched and type
    20 {{{
    21        nocs nuke pana
    22        nocs nuke dhs
    23        nocs status all
    24 }}}
    25    - The output from nocs status all should show that all processes are STOPPED and only 3 pvm processes RUNNING
    26    - Restart NOCS using the blue "Start MOSAIC" button on the MOSAIC 3 MENU gui
    27    - Once the system is back up, take a ZERO image to make sure everything is working
    28    - When the image appears, in the IRAF window type
    29 {{{
    30        mscstat <file name>
    31 }}}
    32    - If the results of the mscstat show the noise (in the Standard Deviation column) to be 4<rms<10, then ...
    33    - restart the observing script by following the instructions here
    34 
    35 == 1. Images appear on the real time display, but are not written immediately to the data directory ==
    36 
    37 '''Symptom''': The real time display shows updated images, but files are very slow to appear in the data directory.  The MOSSTAT and copilot displays are slow to update; copilot may eventually flag the image with a "readtime" failure.
    38 
    39 '''Frequency:''' Frequent as of early February 2016.
    40 
    41 '''Fix:'''
    42    - In the Data Handling System window in the mosaic3:1 VNC session
    43 {{{
    44       - select the Shared Memory Cache tab
    45       - Click Process All
    46       - Click Update Status
    47 
    48 }}}
    49    - the image should then appear in the data directory. 
    50 
    51 
    52 == 2. Detector timeouts ==
    53 
    54 '''Symptom''': Count-down timer in NMSL GUI turns red and keeps counting down into negative numbers (to -105 or so), or hangs during exposure, and no new images are read.
    55 
    56 [[Image(nocs_timeout.png, 400px)]]
    57 
    58 '''Frequency''': Several times per night in the early Feb 2016 run. No timeouts on Feb 5, one on Feb 6.
    59 
    60 '''Fix:'''
    61 - Stop the observing script using "touch quit"
    62 - Try reading out the last image
    63 {{{
    64       ditscmd nohs nohs_endobs
    65 }}}
    66 - Reinitialize the NMSL
    67 {{{
    68        Open a new xterm window on mayall-2
    69        ssh into mosaic3 by typing: ssh observer@mosaic3
    70        Type: nmslReset
    71 }}}
    72 - (The above procedure should be executable from a regular xterm that comes up with NOCS, but for some reason the .alias
    73 file is not being sourced properly. Have to fix this. )
    74 - Take a zero image, make sure it reads out, and check the image by running "mscstat" in the IRAF terminal.
    75 
    76 - If the zero image times out, then check the ccp:
    77 {{{
    78      nocs fullstatus ccp
    79 
    80      It should look something like this:
    81      "stuff here"
    82 
    83 }}}
    84 
    85 This will indicate if the problem is at the controller level. I.e., you should see something
    86 indicating the controllers are all ok, or one or more are giving errors.
    87 
    88 If fullstatus looks OK (not a controller problem), but the above ditscmd did not work,
    89 then just NUKE!
    90 
    91 {{{
    92 Stop Mosaic
     7 Symptom:: No new images are showing up on disk or in the obsbot display, and the real time display does not update.  The control system and readout timer, however, give no indication that data are not being written.
     8 Frequency:: This appeared as a new problem starting on 5/6 Feb 2016.
     9 Fix::
     10       - In a mosaic3 xterm window type
     11         {{{
     12 touch ~observer/exec/mosbot/quit
     13         }}}
     14       - Wait for the exposure to complete and the observing script to stop
     15       - Stop NOCS using the red "Stop MOSAIC" button on the MOSAIC 3 MENU gui
     16       - Once NOCS has shut down, find the blue xterm window which came up when NOCS was launched and type
     17         {{{
    93 18 nocs nuke pana
    94 19 nocs nuke dhs
    95 Start Mosaic
    96 }}}
    97 
    98 The nuke commands will clear issues with stale PAN processes.
    99 
    100 Take a zero and see if it looks ok by running mscstat. If good, restart the bot and keep going!
    101 
    102 There is (should be) no need to Stop/Start Cameras. However, if it doesn't look ok, then you may have to do a full restart.
    103 
    104 {{{
    105 Stop Mosaic
    106 Stop Cameras
    107 Start Cameras
    108 Start Mosaic
    109 }}}
    110 
    111 Take a zero image, make sure it reads out, and check the image by running "mscstat". If the zero image looks ok, you can restart the observing script.
     20 nocs status all
     21         }}}
     22       - The output from nocs status all should show that all processes are STOPPED and only 3 pvm processes RUNNING
     23       - Restart NOCS using the blue "Start MOSAIC" button on the MOSAIC 3 MENU gui
     24       - Once the system is back up, take a ZERO image to make sure everything is working
     25       - When the image appears, in the IRAF window type
     26         {{{
     27 mscstat <file name>
     28         }}}
     29       - If the results of the mscstat show the noise (in the Standard Deviation column) to be 4<rms<10, then ...
     30       - restart the observing script by following the instructions here
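
The noise criterion in the last two steps can be written as a small check. This is only a minimal sketch, assuming the Standard Deviation values have already been copied out of the mscstat output by hand; the function name `noise_ok` and the sample numbers are hypothetical and are not part of the observing software.

```python
def noise_ok(rms_values, low=4.0, high=10.0):
    """Return True when every amplifier rms satisfies low < rms < high."""
    values = list(rms_values)
    # An empty list means nothing was measured, so do not report success.
    return bool(values) and all(low < rms < high for rms in values)


# Hypothetical Standard Deviation values read off the mscstat output.
print(noise_ok([5.1, 4.8, 6.2, 9.0]))     # every value inside 4 < rms < 10
print(noise_ok([5.1, 4.8, 1250.0, 9.0]))  # one amplifier far too noisy
```

If the check fails, the ZERO image should not be trusted and the observing script should not be restarted yet.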
     31
     32 == 1. Images appear on the real time display, but are not written immediately to the data directory ==
     33
     34 Symptom:: The real time display shows updated images, but files are very slow to appear in the data directory.  The MOSSTAT and copilot displays are slow to update; copilot may eventually flag the image with a "readtime" failure.
     35 Frequency:: Frequent as of early February 2016.
     36 Fix::
     37       * In the Data Handling System window in the mosaic3:1 VNC session
     38         - select the Shared Memory Cache tab
     39         - Click Process All
     40         - Click Update Status
     41       * The image should then appear in the data directory. 
     42
     43 == 2. Detector timeouts ==
     44
     45 Symptom:: Count-down timer in NMSL GUI turns red and keeps counting down into negative numbers (to -105 or so), or hangs during exposure, and no new images are read.[[BR]]
     46           [[Image(nocs_timeout.png, 400px)]]
     47 Frequency:: Several times per night in the early Feb 2016 run. No timeouts on Feb 5, one on Feb 6.
     48 Fix::
     49       - Stop the observing script using "touch quit"
     50       - Try reading out the last image
     51         {{{
     52 ditscmd nohs nohs_endobs
     53         }}}
     54       - Reinitialize the NMSL
     55         * Open a new xterm window on mayall-2
     56         * ssh into mosaic3 by typing: `ssh observer@mosaic3`
     57         * Type: `nmslReset`
     58      - The above procedure should be executable from a regular xterm that comes up with NOCS, but for some reason the .alias file is not being sourced properly. Have to fix this.
     59      - Take a zero image, make sure it reads out, and check the image by running "mscstat" in the IRAF terminal.
     60      - If the zero image times out, then check the ccp:
     61        {{{
     62 nocs fullstatus ccp
     63        }}}
     64        It should look something like this:
     65        {{{
     66 "stuff here"
     67        }}}
     68        This will indicate if the problem is at the controller level. That is, you should see something indicating the controllers are all ok, or one or more are giving errors.
     69      - If fullstatus looks OK (not a controller problem), but the above ditscmd did not work,  then just NUKE!
     70        * Stop Mosaic
     71          {{{
     72 nocs nuke pana
     73 nocs nuke dhs
     74          }}}
     75        * Start Mosaic. The nuke commands will clear issues with stale PAN processes.
     76      - Take a zero and see if it looks ok by running mscstat. If good, restart the bot and keep going!
     77      - There is (should be) no need to Stop/Start Cameras. However, if it doesn't look ok, then you may have to do a full restart.
     78        * Stop Mosaic
     79        * Stop Cameras
     80        * Start Cameras
     81        * Start Mosaic
     82      - Take a zero image, make sure it reads out, and check the image by running "mscstat". If the zero image looks ok, you can restart the observing script.
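
The escalation in the fix above (reset, then nuke, then full restart, with a zero image checked after each stage) can be summarised as an ordered list. The following is a rough sketch only: the names `RECOVERY_LADDER` and `next_rung` are hypothetical, the strings merely repeat the commands and GUI buttons quoted above, and the `nocs fullstatus ccp` diagnosis is deliberately left out because its output has to be read by a person.

```python
# Each rung is tried only if the zero image taken after the previous rung
# still fails. Entries such as "Stop Mosaic" are GUI buttons, not commands.
RECOVERY_LADDER = [
    ("reset", ["ditscmd nohs nohs_endobs", "nmslReset"]),
    ("nuke", ["Stop Mosaic", "nocs nuke pana", "nocs nuke dhs", "Start Mosaic"]),
    ("full restart", ["Stop Mosaic", "Stop Cameras", "Start Cameras", "Start Mosaic"]),
]


def next_rung(failed_zero_checks):
    """Return (name, steps) to try after this many failed zero images, or None."""
    if 0 <= failed_zero_checks < len(RECOVERY_LADDER):
        return RECOVERY_LADDER[failed_zero_checks]
    return None  # the notes list nothing further to try


# Example: the first zero image after the reset still timed out.
name, steps = next_rung(1)
print(name)
for step in steps:
    print("  " + step)
```

Keeping the stages in one ordered list makes it explicit that Stop/Start Cameras is the last resort and is not part of the routine recovery.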
    112 83
    113 84 == 3. Images show bad amp ==
    114 85
    115 '''Symptom:''' One amp has extremely large noise: the image looks awful when displayed with "mscdisplay", and the "mscstat" command reports a noise > 1000 ADU/pix.
    116 
    117 '''Frequency:''' Several times in Dec 2015 run, never in early Feb 2016 run.
    118 
    119 '''Fix:'''
    120 - Stop the observing script with Ctrl-C
    121 - Reset the NOCS controller by typing the following in a NOCS xterm window:
    122 {{{
    123         nocs reset ccp
    124         nocs init ccp
    125 }}}
    126 - Take two zero images. The first image is always bad. The second should be OK.
    127    Check it using  "mscstat <filename>" once it has read out. All rms values should be between 4 and 7 ADU except for amp 6, which will be 8-10 ADU.
    128 - Re-start the observing script
     86 Symptom:: One amp has extremely large noise: the image looks awful when displayed with "mscdisplay", and the "mscstat" command reports a noise > 1000 ADU/pix.
     87 Frequency:: Several times in Dec 2015 run, never in early Feb 2016 run.
     88 Fix::
     89       - Stop the observing script with Ctrl-C
     90       - Reset the NOCS controller by typing the following in a NOCS xterm window:
     91         {{{
     92 nocs reset ccp
     93 nocs init ccp
     94         }}}
     95       - Take two zero images. The first image is always bad. The second should be OK. Check it using  "mscstat <filename>" once it has read out. All rms values should be between 4 and 7 ADU except for amp 6, which will be 8-10 ADU.
     96       - Re-start the observing script
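
The per-amp criterion for the second zero image can be expressed as a short check. This is a minimal sketch under stated assumptions: the rms values are typed in by hand from the mscstat output, the amps are keyed by integer number (how amp 6 is labelled in the real output is not specified here), and the names `DEFAULT_RANGE`, `AMP_RANGES` and `bad_amps` are hypothetical.

```python
DEFAULT_RANGE = (4.0, 7.0)     # ADU, expected for most amps
AMP_RANGES = {6: (8.0, 10.0)}  # amp 6 is expected to be noisier


def bad_amps(rms_by_amp):
    """Return the sorted amp numbers whose rms lies outside its expected range."""
    bad = []
    for amp, rms in rms_by_amp.items():
        low, high = AMP_RANGES.get(amp, DEFAULT_RANGE)
        if not (low <= rms <= high):
            bad.append(amp)
    return sorted(bad)


# Hypothetical readings: amp 3 shows the > 1000 ADU symptom described above.
print(bad_amps({1: 5.2, 3: 1500.0, 6: 9.1}))
```

An empty result means the second zero image passes and the observing script can be restarted.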
    129 97
    130 98 == 4. 4MAPS primary mirror support goes “OFF AIR” during observing ==