timing
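Interactive applications, which include everything from real-time games to flight simulators and virtual reality environments, place strong real-time requirements on the whole computing environment to ensure that the right data reaches the user at the right time. This requires two things: first, the application must know when the information will be displayed, so that it can compute the correct contents; second, the frame must actually be displayed at that time.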
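These two pieces of information are managed inconsistently through the graphics stack, which makes it hard for applications to give users smooth animation. And because many APIs sit between an application rendering with OpenGL or Vulkan and the underlying hardware, a failure to handle things correctly at any point along the chain produces stuttering results.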
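Fixing this requires changes throughout the system: making the kernel provide better control over, and information about, the queuing and presentation of images; changing composited window systems to ensure that images are displayed at the target time and that the actual presentation time is reported back to applications; and finally, adding to presentation APIs such as Vulkan to expose control over image presentation times and feedback about when images finally became visible to the user.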
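This presentation first demonstrates the effects of the poor display timing support inherent in the current graphics stack, then discusses the different solutions required at each level of the system, and finally shows a working system.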
1. Improving Graphics Interactivity
It's all in the Timing
Keith Packard, keithp.com, Valve
2. Introduction
● What do we want?
  – Every frame displayed precisely when the application wants it.
  – Constant frame rate.
● Why is this hard?
  – Lots of moving parts:
    ● application scene changes
    ● compositing environment changes
    ● power/thermal management
  – Asynchronous processing
    ● Applications queue rendering to GPU
    ● Display must wait for GPU completion
3. Direct with Flip
4. Direct with Copy
5. Missing a Frame
6. Displaying a Frame Early
7. Requirements
● Tell apps when vblank will be
● Allow apps to specify when frames should be displayed
● Get frames displayed on time
● Tell apps when frames were displayed
  – And when rendering was complete, in the same time domain
8. OpenGL
● GLX_OML_sync_control
  – Specify target present frame count
  – Avoids early frame presentation
● But, no feedback about when frames were actually presented
  – Many kludges required to guess
● GLX_EXT_swap_control
  – Sets (min) number of frames per presentation
  – No feedback on actual presentation time.
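To make "specify target present frame count" concrete, here is a minimal sketch of how an application might turn a desired display time into a target frame count. It assumes the application already has a recent (timestamp, frame count) pair for a past vblank and a fixed refresh interval; the function name and parameters are hypothetical and not part of any GLX API.

```python
def target_frame_count(last_time, last_count, refresh_interval, desired_time):
    """Return the count of the first vblank at or after desired_time.

    All times share one unit (for example microseconds). The inputs are
    assumed to come from an earlier vblank report; nothing here calls GLX.
    """
    if desired_time <= last_time:
        # The requested time has already passed: the next vblank is the
        # earliest frame that can still be targeted.
        return last_count + 1
    delta = desired_time - last_time
    # Integer ceiling division avoids floating-point rounding surprises.
    frames_ahead = -(-delta // refresh_interval)
    return last_count + frames_ahead
```

Because the count only says which frame to target, the application still has to guess afterwards whether that frame was actually hit, which is the missing feedback the slide points out.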
9. Vulkan
● GOOGLE_display_timing
  – Specify absolute (CLOCK_MONOTONIC) time for frame
  – Feedback about when frames were presented
    ● May be delayed by a long time (but not with Mesa).
● Clocking application rendering
  – Best practice today uses vblank fences
    ● Using EXT_display_control
    ● Which only works for direct display
    ● And doesn't say whether a present happened
  – Want something triggered by present
    ● That turns out to be hard to specify
● EXT_calibrated_timestamps
  – Get GPU/OS clock values for the “same time”
  – Allows conversion between GPU and OS time domains
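The conversion between GPU and OS time domains can be sketched as simple linear extrapolation from one calibrated sample pair. This assumes the application has already obtained a GPU tick value and an OS nanosecond value for the "same time", plus the duration of one GPU tick in nanoseconds; the function and parameter names are illustrative, not Vulkan identifiers.

```python
def gpu_ticks_to_os_ns(gpu_ticks, calib_gpu_ticks, calib_os_ns, ns_per_tick):
    """Map a GPU timestamp into the OS clock domain.

    calib_gpu_ticks and calib_os_ns are one calibrated sample pair taken
    at (approximately) the same instant. ns_per_tick is the GPU tick
    length in nanoseconds. Drift between the clocks is ignored, so the
    result degrades the further gpu_ticks is from the calibration sample.
    """
    elapsed_ns = (gpu_ticks - calib_gpu_ticks) * ns_per_tick
    return calib_os_ns + round(elapsed_ns)
```

With this, a "rendering complete" GPU timestamp can be compared directly against a presentation time reported in the OS clock domain, provided calibration is refreshed often enough to bound drift.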
10. Old Vulkan Loop
    frame_step = 16.67 ms
    current_time = 0
    while(running) {
        RenderFrame(frame_step);
        current_time += frame_step;
        PresentFrame();
        frame_step = LengthOfThisFrame();
    }
11. New Vulkan Loop
    frame_step = 16.67 ms
    current_time = 0
    while(running) {
        RenderFrame(current_time);
        current_frame_id = PresentFrame(current_time);
        history = QueryFrameInfos();
        frame_step = FrameTimingHeuristics(history);
        current_time += frame_step;
    }
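The slides do not say what FrameTimingHeuristics does, so the following is only one possible sketch. It assumes each history record carries the desired, actual, and earliest-possible presentation times in nanoseconds; the PastPresent type, the thresholds, and the extra step_ns and refresh_ns parameters are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PastPresent:
    desired_ns: int   # time the application asked for
    actual_ns: int    # time the frame became visible
    earliest_ns: int  # earliest time it could have been shown


def frame_timing_heuristic(history, step_ns, refresh_ns):
    """Pick the next frame step as a whole number of refresh intervals."""
    if not history:
        return step_ns
    # A frame counts as late if it missed its target by over half a refresh.
    late = sum(1 for h in history if h.actual_ns - h.desired_ns > refresh_ns // 2)
    if late * 2 > len(history):
        # Most recent frames were late: slow down by one refresh interval.
        return step_ns + refresh_ns
    # Every frame was ready a full refresh early: try speeding up.
    had_slack = all(h.actual_ns - h.earliest_ns >= refresh_ns for h in history)
    if had_slack and step_ns > refresh_ns:
        return step_ns - refresh_ns
    return step_ns
```

Stepping by whole refresh intervals keeps the frame rate constant between adjustments, which matches the goal stated in the introduction.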
12. X
● Present extension spec is ready
  – Specify target frame for PresentPixmap
  – Provides feedback on when PresentPixmap was processed
● But the implementation lags
  – When the desktop is composited
13. X with Flip
14. X with Copy
15. Ideal Composited
16. Current X Composited
17. Current X Compositing Process
● Each app rendering request generates damage events to compositor
● Compositor collects damage
● At 'suitable time', compositor draws and calls PresentPixmap
18. Simple X Kludge
● Send damage immediately at PresentPixmap time
  – Compositor can start building the next frame immediately
● Send Present event at next vblank
  – Assumes that compositor succeeded
● This fixes the frame delay
  – Most of the time
  – When the system isn't busy
  – When the app doesn't ask for a delay
19. Slightly better X Kludge
● Pend Damage in X server until 'the right time'
● Deliver damage to compositor. Remember which damage was sent.
● Send Present events to apps at the same time we send Present event to compositor
● No changes in compositor required
20. Principled X Fix
● Mark damage events with sequence numbers
● Change compositor to notify X server which damage sequence is processed by PresentPixmap
● X server can associate compositor PresentPixmap with app PresentPixmaps and deliver correct events.
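The bookkeeping behind this fix can be sketched as a small tracker on the server side. This is a toy model, not X server code: the class, its method names, and the event tuples are hypothetical, and it assumes the compositor reports the highest damage sequence number its frame included.

```python
class PresentTracker:
    """Toy model of matching app presents to compositor frames."""

    def __init__(self):
        self.next_seq = 1
        self.pending = []  # (damage sequence number, app id), oldest first

    def app_present(self, app_id):
        """Record an app present and return its damage sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append((seq, app_id))
        return seq

    def compositor_presented(self, processed_seq, display_time_ns):
        """The compositor's frame covered damage up to processed_seq.

        Returns (app_id, seq, display_time_ns) events for every app
        present that frame contained, and forgets them.
        """
        done = [(app, seq, display_time_ns)
                for seq, app in self.pending if seq <= processed_seq]
        self.pending = [(seq, app)
                        for seq, app in self.pending if seq > processed_seq]
        return done
```

Each app then receives the time its content actually reached the screen rather than the time its request was processed, which is the feedback the requirements slide asks for.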
21. When is The “Right Time”?
● At some fixed point in the frame?
  – But compositor operations may vary
● At the latest possible point in the frame?
  – Estimate compositor time based on previous frames and amount of change?
● When InputFocus app calls PresentPixmap?
  – Likely to be where the user is working
  – Have the app inform on plans?
  – Estimate based on previous app actions?
  – Fallback to other method, or drop frame?
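The "latest possible point" option could be estimated roughly as follows. This sketch uses the worst recent compositor duration plus a safety margin; both the estimator and the margin are assumptions, and it ignores the "amount of change" input the slide also mentions.

```python
def latest_compositor_start_ns(next_vblank_ns, recent_durations_ns, margin_ns):
    """Latest time the compositor can start and still make next vblank.

    recent_durations_ns holds how long the compositor took on previous
    frames. With no history, only the safety margin is subtracted.
    """
    estimate_ns = max(recent_durations_ns) if recent_durations_ns else 0
    return next_vblank_ns - estimate_ns - margin_ns
```

Using the maximum is conservative: it trades a little extra latency for fewer missed frames when compositor work varies from frame to frame.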
22. Linux Flip API
● Current API is awkward
  – Finite event limit in kernel mixes flips and vblank notifies
  – Applications must work around in user space
    ● Test for failure, attempt to empty pending events, retry
  – Times in µs instead of ns
    ● Doesn't match Vulkan time precision
● Single queue spot
  – Queue other buffers in user space
● No 'unqueue'
  – Commit to planned frame up front
● Blocks waiting for rendering(?)
  – The non-atomic path does
  – And I think the atomic does as well.
● Cannot actually support “Mailbox” mode.
23. Queue without blocking
● Kernel can move to HW when rendering completes.
● Allow user space to continue.
● Alternative is to have user space take an event and delay queuing until then.
24. Multiple flips queued
● For same frame
  – Kernel picks last one ready at vblank
  – Idles (and notifies) when possible
● For future frames
  – Allow user space to go idle for longer
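The "kernel picks last one ready at vblank" rule can be modelled in a few lines. This is a simulation of the proposed behaviour, not an existing kernel interface: the Flip type and select_flip function are hypothetical, and the queue is assumed to be in submission order.

```python
from dataclasses import dataclass


@dataclass
class Flip:
    buffer: str        # name of the buffer to show
    target_frame: int  # earliest frame this flip may appear on
    ready: bool        # True once rendering to the buffer has finished


def select_flip(queue, frame):
    """Choose what to display at the vblank for `frame`.

    Returns (chosen, superseded, remaining). `chosen` is the most recently
    queued flip that is due and ready, or None. `superseded` holds older
    due flips whose buffers can be idled, with their owners notified.
    `remaining` stays queued for a later vblank.
    """
    chosen_index = None
    for i, flip in enumerate(queue):
        if flip.target_frame <= frame and flip.ready:
            chosen_index = i
    if chosen_index is None:
        return None, [], list(queue)
    superseded = [f for f in queue[:chosen_index] if f.target_frame <= frame]
    remaining = [f for i, f in enumerate(queue)
                 if i > chosen_index or f.target_frame > frame]
    return queue[chosen_index], superseded, remaining
```

Flips aimed at future frames stay in the queue untouched, which is what lets user space go idle for longer.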
25. Cancel queued entries
● Useful when queued for many future frames
  – avoid displaying from terminated apps
● Necessary if we don't get multi-queue
  – Handle all of that from user space
26. Summary
● Extend Vulkan to expose existing X capabilities
● Fix timing under composited X
● Enhance Linux flip API
  – Make flips more reliable
  – Support Mailbox mode
  – Provide ns resolution
27. Thanks!
Keith Packard, keithp@keithp.com, Valve