SIGBUS during WRF run, warnings during metgrid


Postby msullivan-cornell » Sun May 14, 2017 1:26 pm

I am building WPS and WRF 3.8 from source in a Docker container, more or less following the instructions outlined here. I've gotten this to work fine in the past, but this time I'm using a different base container.

I am using the gfortran, mpich, and dmpar config for WRF, and gfortran, serial config for WPS. I am configuring "real" model runs.

geogrid, ungrib, metgrid, and real all run just fine. I then run wrf.exe in parallel, and it runs fine for about 33% of the model run before failing with a SIGBUS and the following message:

Code:
===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 7
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Bus error (signal 7)


I also get an interesting message when metgrid completes that (it seems to me) is likely related to the SIGBUS. From my early days in computer science, I remember learning that bus errors can happen when floating-point numbers are addressed incorrectly:

Code:
Note: The following floating-point exceptions are signalling: IEEE_OVERFLOW_FLAG IEEE_UNDERFLOW_FLAG IEEE_DENORMAL


This leads me to believe I have to tweak my compilation config for WPS / WRF. Does anyone have advice?

Re: SIGBUS during WRF run, warnings during metgrid

Postby msullivan-cornell » Mon May 15, 2017 8:18 pm

Here is an interesting development / clue... It seems the model runs complete fine if I don't use every core available on the system, i.e. if I use (available CPUs - 4) it finishes! I first tried (available CPUs - 2), and it finished *sometimes* but not always. It never completes when I use all CPUs... Very weird!

I guess I can live with this, but does anyone know why using all of the CPUs causes this SIGBUS?
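
The "available CPUs minus N" experiment described above can be scripted so that the rank count is computed rather than typed by hand. This is only a minimal sketch, assuming an MPICH-style mpiexec and a wrf.exe in the current directory; wrf_ranks and wrf_launch_cmd are hypothetical helper names, not part of WRF or MPICH:

```shell
#!/bin/sh
# Hypothetical helper: how many MPI ranks to start while leaving cores idle.
# $1 = total cores on the machine, $2 = cores to leave free.
wrf_ranks() {
    total=$1
    reserve=$2
    ranks=$((total - reserve))
    # Never drop below a single rank.
    if [ "$ranks" -lt 1 ]; then
        ranks=1
    fi
    echo "$ranks"
}

# Hypothetical helper: print (but do not run) the launch command,
# so it can be checked before starting a long model run.
wrf_launch_cmd() {
    echo "mpiexec -n $(wrf_ranks "$1" "$2") ./wrf.exe"
}
```

For example, wrf_launch_cmd "$(nproc)" 4 prints the command for a run that leaves four cores idle (nproc comes from GNU coreutils; other systems may need a different way to count cores). Printing the command instead of executing it keeps the sketch free of side effects.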

Re: SIGBUS during WRF run, warnings during metgrid

Postby msullivan-cornell » Tue May 16, 2017 8:10 am

OK, well, I lied. There *might* be a correlation between the number of CPUs I am using and this error, but the errors keep popping up.

I'd really appreciate any ideas. I have a couple more small things I am going to try. I realize I was using WRF 3.7 previously, so I may try recompiling with that version instead of 3.8.1 to see if that makes a difference.

Re: SIGBUS during WRF run, warnings during metgrid

Postby etorresm » Wed Mar 07, 2018 4:51 pm

I had this problem, but it is running now. My solution was:

1. Delete all the files and start again.
2. Launch with mpiexec -l -n 4 ./wrf.exe   # replace 4 with your number of processors

Best wishes from Colombia

Re: SIGBUS during WRF run, warnings during metgrid

Postby twister87 » Sat May 26, 2018 3:23 am

@msullivan-cornell thanks for your post.

I am having the exact same issue with WRF 3.9.1.1:

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Successful completion of metgrid. !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Note: The following floating-point exceptions are signalling: IEEE_OVERFLOW_FLAG IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
RUNNING WRF
Fri 25 May 21:54:24 BST 2018
starting wrf task 0 of 1
[0] starting wrf task 0 of 4
[1] starting wrf task 1 of 4
[2] starting wrf task 2 of 4
[3] starting wrf task 3 of 4

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 1
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================

I am running it on an Ubuntu 16.04 virtual machine in VirtualBox on a Mac. I have 4 processors available, but I cannot run WRF at all, not even with fewer than 4 processors.

I tried the solution offered by @etorresm, but it does not work.

Any hints or suggestions on how to solve this?
Thanks
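
The log above shows only the MPI launcher's summary (exit code 1), not the reason a rank stopped. With a dmpar build, each rank normally writes its own rsl.error.NNNN file in the run directory, and the last lines of those files usually carry the actual error. A rough sketch for collecting them, assuming that file naming; show_rsl_tails is a hypothetical helper name, not part of WRF:

```shell
#!/bin/sh
# Print the last few lines of every per-rank error log in a WRF run directory.
# Assumes the rsl.error.NNNN naming that dmpar builds of WRF normally produce.
# $1 = run directory (default: current), $2 = lines per file (default: 5).
show_rsl_tails() {
    run_dir=${1:-.}
    lines=${2:-5}
    found=0
    for f in "$run_dir"/rsl.error.*; do
        # Skip the unexpanded pattern when no log files exist.
        [ -e "$f" ] || continue
        found=1
        echo "== $(basename "$f")"
        tail -n "$lines" "$f"
    done
    if [ "$found" -eq 0 ]; then
        echo "no rsl.error.* files in $run_dir"
        return 1
    fi
}
```

Running show_rsl_tails . 20 from the directory where wrf.exe was launched prints the final 20 lines of each rank's log, which is typically more informative than the "BAD TERMINATION" banner alone.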

