News:

Latest versions:
Server plugin: 0.5.1
MVP dongle: 0.5.2
Raspberry Pi client: 0.5.2
Windows client: 0.5.2-1

Main Menu

Progress bar "lost" after skipping about 3 times

Started by laz, October 31, 2012, 09:20:16

Previous topic - Next topic

davep

Here's the instrumented code:
                Log::getInstance()->log("Audio", Log::DEBUG, "deallocatecodecsomx mark 5");
                 video->LockClock();
Log::getInstance()->log("Audio", Log::DEBUG, "**DP0**");
                if (!video->ChangeComponentState(omx_aud_rend,OMX_StateIdle)) {
                        Log::getInstance()->log("Audio", Log::DEBUG, "aud_rend ChangeComponentState");
                }
Log::getInstance()->log("Audio", Log::DEBUG, "**DP1**");
        // TODO proper deinit sequence
                // first flush all buffers

                error=OMX_SendCommand(omx_aud_rend,OMX_CommandFlush, omx_rend_input_port, NULL);
Log::getInstance()->log("Audio", Log::DEBUG, "**DP2**");
                if (error!=OMX_ErrorNone) {
                        Log::getInstance()->log("Audio", Log::DEBUG, "OMX_Flush rend in failed %x", error);

                }

        /*      error=OMX_SendCommand(omx_aud_dec,OMX_CommandFlush, omx_codec_input_port, NULL);
                if (error!=OMX_ErrorNone){
                        Log::getInstance()->log("Audio", Log::DEBUG, "OMX_Flush codec out failed %x", error);

                }*/


        /*      if (!video->CommandFinished(omx_aud_dec,OMX_CommandFlush,omx_codec_input_port)) {
                        Log::getInstance()->log("Audio", Log::DEBUG, "flush cmd codec input failed");
                }*/



                error=OMX_SendCommand(omx_clock,OMX_CommandFlush, omx_clock_output_port, NULL);
Log::getInstance()->log("Audio", Log::DEBUG, "**DP3**");
                if (error!=OMX_ErrorNone){
                        Log::getInstance()->log("Audio", Log::DEBUG, "OMX_Flush clock out failed %x", error);

                }

                error=OMX_SendCommand(omx_aud_rend,OMX_CommandFlush, omx_rend_clock_port, NULL);
Log::getInstance()->log("Audio", Log::DEBUG, "**DP4**");
                if (error!=OMX_ErrorNone){
                        Log::getInstance()->log("Audio", Log::DEBUG, "OMX_Flush rend clock failed %x", error);

                }
                Log::getInstance()->log("Audio", Log::DEBUG, "deallocatecodecsomx mark 6");


and the hang occurs after DP0 but before DP1. The "aud_rend ChangeComponentState" log entry does not appear either.

MartenR

Thanks, can you nail it down a big further and go into videoomx changecomponentstate method and see where it hangs there.

I suspect the OMX_ call there...  (If I am right please try my advise about comment out some code a few posts above).

Marten

davep

The hang is in the line

        error=OMX_SendCommand(handle,OMX_CommandStateSet,type,0);

in VideoOMX::ChangeComponentState.

I already commented out the lines in audioomx.cc.

MartenR

Ok, then the reason is that something breaks the communication to the omx stuff on the videocore.
And it is not the audio change...
Well this will be difficult.

Marten

hondansx

#19
@Marten

Attached is a debug.log, where the client will hang if I jump from 50% to 20% of the recording, Log says that network logging is disabled.
Using latest git with protocollchanges.
This is reproducable with this recording.

Any Ideas?

GA-EP43 | headless | 1xCineS2 Dual | 1xSkystar 2.6D | VDR 1.7.37 
Frontend: 1xRasperry | 1xION3

MartenR

@ hondansx
This is something different. Can you please put log messages into " void Player::switchState(UCHAR toState, ULONG jumpFrame)" , maybe also restartatframe() there your deadlock occurs. But we have to investigate.

@davep
Your problem is caused, by an interuption of the audio and video feeding threads inside an omx command, we have to find out, where it occurs.
So can you please put log message in videomx and audioomx inside DeliverMediaPacket( so that we can see where the threads is forcefully exited, when the omx connection is damaged. (So in which command it was)

@both I thought these issues were closed, but it seems that I was wrong.
Marten.

hondansx

#21
Quote from: MartenR on November 01, 2012, 21:42:50
@ hondansx
This is something different. Can you please put log messages into " void Player::switchState(UCHAR toState, ULONG jumpFrame)" , maybe also restartatframe() there your deadlock occurs. But we have to investigate.

Attached is a new log (play.log), where I can see that it hangs on demuxer->put(threadBuffer + writeLength, thisRead - writeLength)
Is the error in DemuxerTS (dbg.log)?


    while(writeLength < thisRead)
    {   
      logger->log("Player", Log::DEBUG, "Put in");
      thisWrite = demuxer->put(threadBuffer + writeLength, thisRead - writeLength);
      logger->log("Player", Log::DEBUG, "Put out");
      writeLength += thisWrite;

      if (!thisWrite)
      {   
        // demuxer is full and can't take anymore
        threadLock();
        threadWaitForSignal();
        threadUnlock();
      }

      threadCheckExit();
    }
GA-EP43 | headless | 1xCineS2 Dual | 1xSkystar 2.6D | VDR 1.7.37 
Frontend: 1xRasperry | 1xION3

davep

Quote from: MartenR on November 01, 2012, 21:42:50
Your problem is caused, by an interuption of the audio and video feeding threads inside an omx command, we have to find out, where it occurs.
So can you please put log message in videomx and audioomx inside DeliverMediaPacket( so that we can see where the threads is forcefully exited, when the omx connection is damaged. (So in which command it was)

These are rather large functions and so difficult to instrument. Also it seems that adding more debug code reduces the likelihood of failure. However I do see a consistent signature of the failure in the log:

12:39:26.999358 [debug]  2031 BoxStack - Update called
12:39:27.000129 [debug]  2031 BoxStack - Locked for update
12:39:27.000760 [debug]  2031 BoxStack - Unlocked for update
12:39:27.000918 [debug]  2031 Timers - Starting cancel timer 0x2a9cb4 1, list size = 3
12:39:27.001107 [debug]  2031 Timers - Removed timer for 0x2a9cb4 ref 1
12:39:27.001844 [debug]  2031 Timers - Starting set timer 1
12:39:27.002931 [debug]  2031 Timers - Timer set for 0x2a9cb4 ref 1
12:39:27.003144 [debug]  2031 Timers - Starting set timer 1
12:39:27.004074 [debug]  2031 Boxx - Destruct, now 7
12:39:27.004224 [debug]  2031 Player - LOCKED
12:39:27.004301 [debug]  2031 Player - SKIP FORWARD 60 SECONDS
12:39:27.005327 [debug]  2031 Video - ++++DP0
12:39:27.005496 [debug]  2031 Video - ++++DP1
12:39:27.005573 [debug]  2031 Video - ++++DP2
12:39:27.005649 [debug]  2031 Video - ++++DP2a
12:39:27.006861 [debug]  2031 DemuxerTS - getFrameNumfromPTS pts 3288940265 deleted 21 difference 0
12:39:27.007045 [debug]  2031 Player - Switch state from 1 to 7
12:39:27.007127 [debug]  2031 VFeed - Stop1
12:39:27.008377 [debug]  2031 VFeed - Stop2
12:39:27.008582 [debug]  2031 AFeed - Stop1
12:39:27.010694 [debug]  2031 AFeed - Stop2

This pattern of video calls immediately after the skip does not appear if the skip is successful. Here is the section of code in videoomx.cc:

Log::getInstance()->log("Video", Log::DEBUG, "++++DP2");

        if (!cur_input_buf_omx) {
                input_bufs_omx_mutex.Lock();
                if (input_bufs_omx_free.size()==0) {
                        input_bufs_omx_mutex.Unlock();
                        //Log::getInstance()->log("Video", Log::DEBUG, "Deliver MediaPacket no free sample");
Log::getInstance()->log("Video", Log::DEBUG, "++++DP2a");
                        return 0; // we do not have a free media sample

                }
Log::getInstance()->log("Video", Log::DEBUG, "++++DP2b");
                cur_input_buf_omx=input_bufs_omx_free.front();
                cur_input_buf_omx->nFilledLen=0;
                cur_input_buf_omx->nOffset=0;
                cur_input_buf_omx->nTimeStamp=intToOMXTicks(0);
                input_bufs_omx_free.pop_front();
                input_bufs_omx_mutex.Unlock();
        }

Log::getInstance()->log("Video", Log::DEBUG, "++++DP3");

so it seems that the function returns successfully?

MartenR

@davep
Yes, on this run the function returned sucessfully (if it was not entered) again.
Hmm, maybe the audio thread is causing troubles.
Or the point, that there are no free buffers and deallocateomx can not emit a buffer with EOS???

No real clue.

Marten

MartenR

#24
@ hondansx
it is unlikely that the demuxerts code is making something.
I suspect, that at the endit will hang at putting it to the stream.

I am sorry but I am afraid, that you have to do the same inside demuxerts (I suspect submitpacket and  then in the end putting it to the stream).

Marten

MartenR

@davep
Please also add to audioomx, log message. I do not think that the videoomx thread causes problems, since it exits cleanly.

davep

Quote from: MartenR on November 02, 2012, 21:37:53
@davep
Please also add to audioomx, log message. I do not think that the videoomx thread causes problems, since it exits cleanly.
I did log audioomx but it seemed that the video function was the last to be called. However I've now added logging to the entry and to every exit point and it seems that there may be a problem:
13:29:50.929267 [debug]  2126 Remote - CECLOG: 42622 8 >> 04:44:04
13:29:50.929566 [debug]  2126 Remote - CECLOG: 42622 4 >> TV (0) -> Playback 1 (4): user control pressed (44)
13:29:50.929754 [debug]  2126 Remote - CECLOG: 42622 16 key pressed: right (4)
13:29:50.934198 [debug]  2126 Audio - ++++IN
13:29:50.934417 [debug]  2126 Audio - ----4
13:29:50.937258 [debug]  2126 VDR - RR sleep - opcode 7
13:29:50.939620 [debug]  2126 Audio - ++++IN
13:29:50.939771 [debug]  2126 Audio - ----4
13:29:50.941122 [debug]  2126 Timers - Timer firing for client 0x25a07c ref 2
13:29:50.942435 [debug]  2126 Boxx - Construct, now 8
13:29:50.943731 [debug]  2126 VVideoRec - Draw bar clocks
13:29:50.945727 [debug]  2126 Audio - ++++IN
13:29:50.947276 [debug]  2126 DemuxerTS - getFrameNumfromPTS pts 6878675788 deleted 8 difference 0
13:29:50.948233 [debug]  2126 VVideoRec - 0:09:25 / 1:00:50
13:29:50.948914 [debug]  2126 BoxStack - Update called
13:29:50.949016 [debug]  2126 BoxStack - Locked for update
13:29:50.953240 [debug]  2126 VDR - Rxd a response packet, requestID=609, len=100000
13:29:50.954256 [debug]  2126 VDR - RR unsleep
13:29:50.956589 [debug]  2126 BoxStack - Unlocked for update
13:29:50.957784 [debug]  2126 Timers - Starting cancel timer 0x25a07c 1, list size = 3
13:29:50.957936 [debug]  2126 Timers - Removed timer for 0x25a07c ref 1
13:29:50.958998 [debug]  2126 Timers - Starting set timer 1
13:29:50.959302 [debug]  2126 Timers - Timer set for 0x25a07c ref 1
13:29:50.959408 [debug]  2126 Timers - Second rollover - CHECK FIXME
13:29:50.959483 [debug]  2126 Timers - Starting set timer 1
13:29:50.959566 [debug]  2126 Boxx - Destruct, now 7
13:29:50.959649 [debug]  2126 Player - LOCKED
13:29:50.962920 [debug]  2126 Player - SKIP FORWARD 60 SECONDS
13:29:50.961781 [debug]  2126 Timers - sending timer to 0x25a07c with parameter 2
13:29:50.963544 [debug]  2126 VVideoRec - Draw bar clocks
13:29:50.964785 [debug]  2126 DemuxerTS - getFrameNumfromPTS pts 6878677408 deleted 8 difference 0
13:29:50.964960 [debug]  2126 Player - Switch state from 1 to 7
13:29:50.965044 [debug]  2126 VFeed - Stop1
13:29:50.968532 [debug]  2126 DemuxerTS - getFrameNumfromPTS pts 6878677768 deleted 8 difference 0
13:29:50.968804 [debug]  2126 VVideoRec - 0:09:25 / 1:00:50
13:29:50.969084 [debug]  2126 VFeed - Stop2
13:29:50.969206 [debug]  2126 AFeed - Stop1
13:29:50.969924 [debug]  2126 BoxStack - Update called
13:29:50.970044 [debug]  2126 BoxStack - Locked for update
13:29:50.972577 [debug]  2126 AFeed - Stop2
13:29:50.974870 [debug]  2126 BoxStack - Unlocked for update
13:29:50.975064 [debug]  2126 Timers - Second rollover - CHECK FIXME
13:29:50.975146 [debug]  2126 Timers - Starting set timer 1
13:29:50.975228 [debug]  2126 Timers - timerEventFinished for 0x25a07c
13:29:50.975306 [debug]  2126 Timers - timerEventFinished RESTART for 0x25a07c

The ++++ lines are the function entry and ---- marks the exits. It seems that we do enter just before the skip and never exit.

I will try to narrow down the point of failure, but this is a complex function with many code paths and may take a while...

MartenR

QuoteThe ++++ lines are the function entry and ---- marks the exits. It seems that we do enter just before the skip and never exit.
That is exactly what I want to know. The audio thread is cancel, so the function can be exited at any command through a cancelation point and if the cancelation point is inside an omx call, the communication with the video hardware is lost. So I need to know through which command the audio thread is exiting so that I can prevent this.

Marten

davep

Not as difficult as I feared  8)

The failing line is

     error = vw->ProtOMXEmptyThisBuffer(omx_aud_rend/*dec*/, cur_input_buf_omx);

at the end of the function. I suspect there is a timing issue somewhere as the failure changed depending on how many logging statements were included - in some cases the hang problem disappeared completely. This is the output with one log statement immediately before the above line:

15:54:15.905686 [debug]  2028 Remote - CECLOG: 64502 16 received data: header:00030004 p0:00044404 p1:00000000 p2:0000000
0 p3:00000000 reason:4
15:54:15.906636 [debug]  2028 Remote - CECLOG: 64503 8 >> 04:44:04
15:54:15.906842 [debug]  2028 Remote - CECLOG: 64504 4 >> TV (0) -> Playback 1 (4): user control pressed (44)
15:54:15.906996 [debug]  2028 Remote - CECLOG: 64504 16 key pressed: right (4)
15:54:15.913810 [debug]  2028 Boxx - Construct, now 8
15:54:15.909163 [debug]  2028 VDR - Rxd a response packet, requestID=1318, len=100000
15:54:15.914351 [debug]  2028 VVideoRec - Draw bar clocks
15:54:15.915965 [debug]  2028 DemuxerTS - getFrameNumfromPTS pts 6051208009 deleted 15 difference 0
15:54:15.916198 [debug]  2028 VVideoRec - 0:15:56 / 1:00:28
15:54:15.916888 [debug]  2028 BoxStack - Update called
15:54:15.917002 [debug]  2028 BoxStack - Locked for update
15:54:15.917506 [debug]  2028 BoxStack - Unlocked for update
15:54:15.917635 [debug]  2028 Timers - Starting cancel timer 0xa638ac 1, list size = 3
15:54:15.917737 [debug]  2028 Timers - Removed timer for 0xa638ac ref 1
15:54:15.918042 [debug]  2028 VDR - RR unsleep
15:54:15.904278 [debug]  2028 Timers - sending timer to 0xa638ac with parameter 2
15:54:15.924730 [debug]  2028 VVideoRec - Draw bar clocks
15:54:15.925799 [debug]  2028 Timers - Starting set timer 1
15:54:15.926074 [debug]  2028 Timers - Timer set for 0xa638ac ref 1
15:54:15.926177 [debug]  2028 Timers - Second rollover - CHECK FIXME
15:54:15.926252 [debug]  2028 Timers - Starting set timer 1
15:54:15.926330 [debug]  2028 Boxx - Destruct, now 7
15:54:15.926410 [debug]  2028 Player - LOCKED
15:54:15.926483 [debug]  2028 Player - SKIP FORWARD 60 SECONDS
15:54:15.928447 [debug]  2028 Audio - ****9
15:54:15.929584 [debug]  2028 DemuxerTS - getFrameNumfromPTS pts 6051209179 deleted 15 difference 0
15:54:15.929759 [debug]  2028 Player - Switch state from 1 to 7
15:54:15.929843 [debug]  2028 VFeed - Stop1
15:54:15.930666 [debug]  2028 VFeed - Stop2


MartenR

#29
I was also on the hunt of deadlocks, I also found this line.
Please try the following changed code in videoomx and please confirm that at least this navigation deadlock issue is gone: (there might be multiple deadlock, so a new one might appear!)
void VideoOMX::DeinterlaceFix()
{
int oldcancelstate;
int oldcanceltype;
Demuxer* demux=Demuxer::getInstance();
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldcancelstate);
pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &oldcanceltype);
clock_mutex.Lock();
OMX_ERRORTYPE error;
OMX_PARAM_PORTDEFINITIONTYPE port_def_type;
memset(&port_def_type,0,sizeof(port_def_type));
port_def_type.nSize=sizeof(port_def_type);
port_def_type.nVersion.nVersion=OMX_VERSION;
port_def_type.nPortIndex=omx_codec_output_port;

error=OMX_GetParameter(omx_vid_dec,OMX_IndexParamPortDefinition, &port_def_type);
if (error != OMX_ErrorNone) {
Log::getInstance()->log("Video", Log::DEBUG,
"OMX_IndexParamPortDefinition fix failed %x", error);
clock_mutex.Unlock();
return;
}

if (port_def_type.format.video.nFrameWidth == demux->getHorizontalSize()
&& port_def_type.format.video.nFrameHeight == demux->getVerticalSize()){
Log::getInstance()->log("Video", Log::DEBUG,
"Deinit first frame fix");
deint_first_frame=false;

WaitForEvent(omx_vid_dec,OMX_EventPortSettingsChanged);
DisablePort(omx_vid_dec,omx_codec_output_port,false);
DisablePort(omx_vid_sched,omx_shed_input_port,false);
DisablePort(omx_vid_deint,omx_deint_output_port,false);
DisablePort(omx_vid_deint,omx_deint_input_port,false);

port_def_type.nPortIndex=omx_deint_input_port;
error = OMX_SetParameter(omx_vid_deint, OMX_IndexParamPortDefinition,
&port_def_type);
if (error != OMX_ErrorNone) {
Log::getInstance()->log("Video", Log::DEBUG,
"Set OMX_IndexParamPortDefinition1 failed %x", error);
clock_mutex.Unlock();
pthread_setcancelstate(oldcancelstate, NULL);
pthread_setcanceltype(oldcanceltype, NULL);
return;
}

port_def_type.nPortIndex=omx_deint_output_port;
error = OMX_SetParameter(omx_vid_deint, OMX_IndexParamPortDefinition,
&port_def_type);
if (error != OMX_ErrorNone) {
Log::getInstance()->log("Video", Log::DEBUG,
"Set OMX_IndexParamPortDefinition2 failed %x", error);
clock_mutex.Unlock();
pthread_setcancelstate(oldcancelstate, NULL);
pthread_setcanceltype(oldcanceltype, NULL);
return;
}


EnablePort(omx_vid_dec,omx_codec_output_port,false);
EnablePort(omx_vid_deint,omx_deint_input_port,false);
EnablePort(omx_vid_deint,omx_deint_output_port,false);
EnablePort(omx_vid_sched,omx_shed_input_port,false);
}
clock_mutex.Unlock();
pthread_setcancelstate(oldcancelstate, NULL);
pthread_setcanceltype(oldcanceltype, NULL);




}

Edit: I have put this and some other fixes to git. Maybe it helps, but other problems might remain. Look if it is fix and if not do the diagnosis as described before. The problem is that these problems highly depend on timing, so it is very difficult to reproduce them on other machines with other network setups.