Thursday, December 22, 2016

Improve the performance of far seek for ExoPlayer

ExoPlayer

Issue report = https://github.com/google/ExoPlayer/issues/2253

Check in:

Merged #2318.


Environment:


1.     [Movie]: 4K timer MPEG DASH streaming from https://www.youtube.com/watch?v=uo9dAIQR3g8.
2.     [Device]: HTC One X9u with Android 6.0 API = 23.
3.     [Code Base]: Branch = release-v2


Issue description:


Here we look at the upstream side and see where we can optimize.



fig 1: overview of the player pipeline

Fig 1 is the overview of the player pipeline. Please refer to http://programmingmemojohnchang.blogspot.tw/2016/12/improve-performance-of-short-seek-of.html.


fig 2

In figure 2, the rectangle with a dashed border is the latest media chunk being downloaded. It has not been completely downloaded yet, so we draw it with a dashed border.

Originally, if the seek target lies beyond the end of sampleQueue on the upstream side, we clear the whole sampleQueue and then refetch, starting from the media chunk that contains the seek target.
In the case shown here, the refetched media chunk is exactly the last media chunk, the one drawn with the dashed border.

Obviously, re-downloading the part shown in green is wasted effort. If we keep the data of the last, partially downloaded media chunk, we avoid downloading that part again.

If both conditions below are satisfied, we can locate the last key frame within sampleQueue and let the renderer drop the out-of-range (decode-only) samples.
1.     The seek target lies within the last media chunk.
2.     The nearest key frame preceding the seek target is either still within sampleQueue or has already been sent to the decoder (or rendered).

To make use of this fact, we create a new function named skipToLastKeyframe(), which tries to find the last key sample within sampleQueue. We also create a function named isWithinLastChunk() to check whether the seek target lies within the last media chunk.
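
The two checks can be sketched roughly as below. This is a minimal standalone model, not the code merged into ExoPlayer: the class FarSeekSketch, the boolean-array stand-in for sampleQueue, and the helpers lastKeyframeIndex() and canKeepLastChunk() are hypothetical names used only for illustration. Only the name isWithinLastChunk() comes from the change itself, and its real signature may differ.

```java
/** Hypothetical model of the far-seek checks; NOT ExoPlayer's actual classes. */
final class FarSeekSketch {

  /** Returns true if seekUs lies inside [chunkStartUs, chunkEndUs), the span of the last media chunk. */
  static boolean isWithinLastChunk(long seekUs, long chunkStartUs, long chunkEndUs) {
    return seekUs >= chunkStartUs && seekUs < chunkEndUs;
  }

  /**
   * Returns the index of the last key frame among the queued samples, or -1 if none is queued.
   * isKeyframe[i] tells whether the i-th queued sample is a key frame (assumed representation).
   */
  static int lastKeyframeIndex(boolean[] isKeyframe) {
    for (int i = isKeyframe.length - 1; i >= 0; i--) {
      if (isKeyframe[i]) {
        return i;
      }
    }
    return -1;
  }

  /**
   * Combines both conditions: the seek target is in the last chunk, and the nearest preceding
   * key frame is either still queued or was already handed to the decoder.
   */
  static boolean canKeepLastChunk(long seekUs, long chunkStartUs, long chunkEndUs,
      boolean[] isKeyframe, boolean keyframeAlreadySentToDecoder) {
    return isWithinLastChunk(seekUs, chunkStartUs, chunkEndUs)
        && (keyframeAlreadySentToDecoder || lastKeyframeIndex(isKeyframe) >= 0);
  }

  public static void main(String[] args) {
    boolean[] queued = {true, false, false, true, false};
    // Samples before this index could be skipped; later ones are decoded but not shown
    // until the seek target is reached.
    System.out.println("last key frame index = " + lastKeyframeIndex(queued));
    System.out.println("keep last chunk = "
        + canKeepLastChunk(5_250_000L, 5_200_000L, 10_400_000L, queued, false));
  }
}
```

When canKeepLastChunk() is false, the sketch falls back to the original behaviour: clear the queue and refetch the chunk that contains the seek target.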


Test environment:


1. [Movie]: 4K timer MPEG DASH streaming from: 
https://www.youtube.com/watch?v=uo9dAIQR3g8
2. [Device]: HTC One X9u with Android 6.0 API = 23.
3. [Code Base]: Branch = release-v2
4. Fix the bitrate to the max one = 22361348 bits/sec.


Test result:



fig 3


fig 4


fig 5: comparison


The experimental result is analyzed here.

Here we explain the meaning of the results in fig 3 & 4, and how we chose the test conditions.


How do we choose the seek target?

Every time a seek is triggered, we call getBufferedPositionUs() to get the buffered position T and seek to T + 50 ms. This ensures the seek target is always outside the range of sampleQueue.
An example is shown below (I put it in seekToInternal()).

    if (Util.upstreamOptimizationExploration) {
      /*In most of the cases video will have min buffered data*/
      periodPositionUs = loadingPeriodHolder.mediaPeriod.getBufferedPositionUs() + 50000 /*50ms*/;
    }


1. [search target,  media chunk start]
The average distance between the seek target and the start of the last media chunk during the test.
2. data loss
How much data we flushed, measured from the beginning of the last media chunk to the end of sampleQueue.
3. AllRenderersReady:
The time from when the seek is delivered until all renderers are ready (all of them have rendered the first frame). At this point you can see the first frame at the seek target time, but playback is still NOT ready to start.
4. HaveSufficientBuffer:
The point at which playback actually starts.
5. network speed:
The network speed measured while we probed the performance improvement.

Explanation of the result:
Assuming the seek target is uniformly distributed within a segment, and since a segment is typically ~ 5.2 seconds long, on average the seek target is ~ 2.6 seconds away from the start of the containing media chunk. Items (1) & (2) are therefore consistent with this.

(3) can also be explained by the fact that the bitrate is not uniformly distributed within a segment. Since the key frame at the front of a media chunk is usually larger than the other samples, the first half of the chunk has a higher bitrate than the second half. Hence downloading the first half takes longer than downloading the second half.

(4) reflects the relationship between network speed and download bitrate:
(22361348 * 2.6) / 19380250 ~ 2.999935 seconds, compared to 2.914 seconds for item 4.
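
The estimates behind items (1), (2) and (4) can be reproduced with a few lines of arithmetic. The sketch below assumes the seek target is uniformly distributed within a 5.2-second segment and that the network speed 19380250 is in bits/sec, as the formula above implies. The class and method names are hypothetical.

```java
/** Hypothetical helper that re-derives the estimates; not part of ExoPlayer. */
final class SeekEstimate {

  /** Mean offset of a uniformly distributed seek target from the start of its segment. */
  static double expectedOffsetSeconds(double segmentSeconds) {
    return segmentSeconds / 2.0;
  }

  /** Time needed to download mediaSeconds of media at the given bitrate over the given link. */
  static double downloadSeconds(long bitrateBitsPerSec, double mediaSeconds, long networkBitsPerSec) {
    return bitrateBitsPerSec * mediaSeconds / networkBitsPerSec;
  }

  public static void main(String[] args) {
    double offset = expectedOffsetSeconds(5.2);          // half of the segment length
    double redundant = downloadSeconds(22_361_348L, offset, 19_380_250L);
    System.out.printf("expected offset: %.2f s%n", offset);
    System.out.printf("redundant download time: %.2f s%n", redundant);
  }
}
```

With the values from the test, the redundant download comes out at roughly 3 seconds, which is in the same range as the measured value for item 4.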

Finally, the experimental results show that we save about:
1.     3162.865 ms to render the first frame.
2.     1233.17 ms to start playback.

In total, this saves ~ 4396 ms for the 4K test case, i.e. the original scheme takes:
((4228.225+2914) /(1065.36+1680.33)) ~ 260.1% of the time of the proposed one.

Figures 6 & 7 summarize the difference between the original and the proposed scheme, and show where the saving comes from.





fig 6: original method - redownload the last chunk 





fig 7: proposed method - avoid the redundant download
