<?xml version="1.0"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
>
  <channel>
    <title>popo894's Blog</title>
    <link>http://popo894.friendlinkup.com/</link>
    <description>Umut ERTURK</description>
    <language>en</language>    <item>
      <title>Reinventing the wheel: write your own fast sine function</title>
      <link>http://popo894.friendlinkup.com/2009/03/04/reinventing-the-wheel-write-your-own-fast-sinus-function.html</link>
      <description>A couple of days ago I tried to self-study the mathematical explanation of pi (π) and ended up with some very interesting results and ideas.
First of all, let&#8217;s answer the classic question: what is pi?
Pi or π is a mathematical constant whose value is the ratio of any circle&#8217;s circumference to its diameter in Euclidean space&#8230; (Wiki)
From its definition it is pretty easy to express it in terms of another mathematical function: sine.
I can hear two questions rising up in minds right away;
1) Why would I need that? - I don&#8217;t know the answer, just curiosity.
2) How? - The answer is in the rest of the post&#8230; continue&#8230; c&#8217;mon, go on!
Since it is hard to write mathematical formulas and draw shapes under our stupid HCI (human computer interaction; keyboard, mouse, etc.) slavery, I preferred to write them in a notebook and take photos of them for ease of understanding and publishing.
Relation Between Pi and the Sine Function

At first glance it is necessary to know that a circle can be expressed by an infinite number of triangles,
so let&#8217;s start with the most basic equilateral polygon: the square.


The first half of the image above shows how to find the area of the square by using its half-diagonal, which is the first step of the inductive reasoning.
And the second half is a generalisation of the formula for an equilateral polygon with &#8216;n&#8217; sides/corners.
The more we increase the number of sides of the polygon, the more closely it approximates a circle.
Let&#8217;s imagine a polygon with infinitely many sides: it turns into a circle, which means that if we give a very big value to &#8216;n&#8217; and 1 to &#8216;r&#8217; in the resulting formula, the result approximates pi.

Instead of depending on the number of sides of the polygon, we want to depend on the angle alpha; to do that we simply replace n with 360/alpha.


The conclusion of this part is awesome: we can find the value of sine for angles between zero and ten degrees, with an acceptable error (+0.001), using only one multiplication.
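The notebook photos are not part of this feed, so here is only a rough Python sketch of the reasoning above. It assumes the polygon is cut into n triangles with two sides of length r and an apex angle of 360/n, which gives the area n/2 * r^2 * sin(360/n); replacing n with 360/alpha and setting the area to pi then gives sin(alpha) close to alpha * pi/180 for small alpha. The names polygon_area, fast_sin_deg and DEG_TO_RAD are made up for this illustration.

```python
import math

DEG_TO_RAD = math.pi / 180.0  # precomputed once, so the fast sine is one multiplication


def polygon_area(n, r=1.0):
    """Area of a regular n-sided polygon whose corners are at distance r from the centre."""
    return 0.5 * n * r * r * math.sin(2.0 * math.pi / n)


def fast_sin_deg(alpha):
    """Approximate sin(alpha) for a small angle alpha given in degrees."""
    return alpha * DEG_TO_RAD


# Largest error of the approximation over whole degrees from 0 to 10.
worst_error = max(abs(fast_sin_deg(a) - math.sin(math.radians(a))) for a in range(0, 11))
```

With r = 1, polygon_area(4) is the area of the square from the first step, and polygon_area with a very large n approaches pi. worst_error holds the largest deviation from math.sin over the tested range.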
Error Rate
&#8230;.
to be continued
&#8230;.

  </description>
      <guid>http://popo894.friendlinkup.com/2009/03/04/reinventing-the-wheel-write-your-own-fast-sinus-function.html</guid>
      <pubDate>Wed, 04 Mar 2009 00:19:21 -0500</pubDate>
      <dc:creator>popo894</dc:creator>
    </item>
    <item>
      <title>Before going to Turkey</title>
      <link>http://popo894.friendlinkup.com/2008/09/07/before-going-to-turkey.html</link>
      <description>Today is the seventeenth of December. The first semester ended three days ago and I can say I slept for all of the previous two days, as I hadn&#8217;t slept for two days before the assessment submissions. Anyway, before going to Turkey I should make a list of what I should take from there.

Tobacco for my smoker friends
Blank dvds for myself
Razor blades for myself
New shoes, actually this is not necessary,
Tadım çekirdek for Erdogan
35cl Rakı, this is not necessary as well.

Ok, if you want me to take something else, you should mail me whoever you are (:
My bus from Dundee to Edinburgh is at 8:55 am and I&#8217;ll be in Edinburgh at 9:20, after that I&#8217;ll go to the airport. My flight is to London first, then I&#8217;ll go to Istanbul.
Wait for me Istanbul, I am coming&#8230; (at 10pm)
</description>
      <guid>http://popo894.friendlinkup.com/2008/09/07/before-going-to-turkey.html</guid>
      <pubDate>Sun, 07 Sep 2008 22:56:37 -0400</pubDate>
      <dc:creator>popo894</dc:creator>
    </item>
    <item>
      <title>Dissertation Topic for Masters Proposal</title>
      <link>http://popo894.friendlinkup.com/2008/09/07/dissertation-topic-for-masters-proposal.html</link>
      <description>Nowadays I am looking for my master&#8217;s thesis topic and most probably I&#8217;ll choose &#8216;Implementation of Real-time Ray Tracing by using BVH Algorithm on NVIDIA CUDA Architecture&#8217;. CUDA is a very new technology from NVIDIA for NVIDIA GeForce 8 series graphics cards. What is different in CUDA is that it allows you to run C code on the GPU, with some exceptions such as no recursion; moreover, it lets the code reach arbitrary addresses in the GPU&#8217;s memory and provides a very fast 16KB shared memory area (we can call it a cache). This architecture also needs stream processing drivers; unfortunately, NVIDIA has so far provided the drivers only for Linux and Windows XP. I hope they provide Vista drivers as soon as possible, otherwise I have to buy a desktop computer with an NVIDIA 8800 GTX. Hmmmm, thatz not baddddd.
CUDA: Compute Unified Device Architecture
For basic information: Cuda Wiki, Ray Tracing Wiki
For more information: Cuda NVIDIA, Vision, Modeling, and Visualization 2005: Proceedings, November 16-18, 2005 &#8230; By G. Grenier


</description>
      <guid>http://popo894.friendlinkup.com/2008/09/07/dissertation-topic-for-masters-proposal.html</guid>
      <pubDate>Sun, 07 Sep 2008 23:01:08 -0400</pubDate>
      <dc:creator>popo894</dc:creator>
    </item>
    <item>
      <title>CUDA and Ray tracing; are they really a good combination?</title>
      <link>http://popo894.friendlinkup.com/2008/09/09/cuda-and-ray-tracing-are-they-really-a-good-combination.html</link>
      <description>Last week I mailed people who have worked on ray tracing with the CUDA architecture and I received some disheartening mails. I would like to share these mails since you may need to understand the reasons for the conflict between NVidia&#8217;s CUDA 1.1 architecture and ray tracing.
I sent this mail to &#8220;the secret matador&#8221;, whom I found in NVidia&#8217;s CUDA forums;
 &gt; umut has sent you this email from
&gt; http://forums.nvidia.com/index.php.
&gt;
&gt;
&gt; Hello
&gt; I am studying my master degree on computer games
&gt; tech and I&#8217;m planning to choose ray tracing on cuda
&gt; processors in dynamic scenes for my master thesis,
&gt; however, as i read from your post itz really not a
&gt; good idea. Is it? I am also concerned about the
&gt; number of replies to your post (no reply as i see).
&gt; I would like to ask you if it is a good idea to do
&gt; my master thesis on this topic or would you like to
&gt; suggest some different topics about CUDA?
&gt; I would really appreciate your answer.
&gt; Thanks
&gt;
&gt; UMUT
&gt; umuter { @t } gm@il.com
&gt;
Well, I was trying to perform raytracing with CUDA
also some time ago. My scenes contained triangles. If
you want to use CUDA for just spheres I think it can be
as fast as Cell ( see
http://eric_rollins.home.mindspring.com/ray/cuda.html
)
Well, for xNormal I tried CUDA with triangle
scenes (1-10M triangles) and I wasn&#8217;t able to get it
working at decent speed. Perhaps if you use smaller
scenes it could be ok&#8230; but it was terribly slow for me&#8230;
I think my scenes were too big to fit well in the
shared/const memory. I used BVHs, grids and stackless
kdtrees without any good result. A dual-core was
always faster. Btw, the CUDA stackless approach sux a
lot! CUDA is very good at processing linear data (like an
image kernel) but terrible at processing random and
recursive structures like the ones that raytracing
requires.
Another problem I found was that there&#8217;s no CUDA x64
version, neither is it Vista compatible, and the VRAM on
desktop G80 cards was too small for me (256-512Mb is
typical and my scenes were occupying about 1Gb).
CUDA 1.1 does not support 3D textures so it is a pain
to get the uniform grids working&#8230;
Well, I hope you can improve on my results&#8230; CUDA was
not very good for me. I hope they add a function
stack, a better synchronization mechanism and improved
registers/instruction count for CUDA 2.
After getting this mail I asked a similar question to Eric Rollins, who is most probably one of the first people to try implementing a ray tracing algorithm on CUDA. I also asked which one I should choose, PS3 or CUDA, and the answer is pretty clear: PS3. Eric Rollins&#8217;s complete answer:
Umut;
I think you are correct about the difficulties with ray tracing on CUDA.  It is easier on the Cell/PS3, though still challenging.  See papers linked on my PS3 ray tracing page, and MIT blue steel: http://cag.csail.mit.edu/ps3/blue-steel.shtml
In my code for CUDA and Cell/PS3 the only primitive I have implemented is the sphere.
I assume you have already seen this paper on alternative approaches:  http://graphics.stanford.edu/papers/i3dkdtree/gpu-kd-i3d.pdf .  Also some of the latency avoidance techniques discussed for Cell/PS3 might apply to CUDA.  I do recommend reading the papers for Cell/PS3 even if you want to try CUDA.
If you have a big insight on how to do CUDA, go with it.  Otherwise I recommend Cell/PS3.
Good luck.
</description>
      <guid>http://popo894.friendlinkup.com/2008/09/09/cuda-and-ray-tracing-are-they-really-a-good-combination.html</guid>
      <pubDate>Tue, 09 Sep 2008 23:50:16 -0400</pubDate>
      <dc:creator>popo894</dc:creator>
    </item>
    <item>
      <title>My Masters Dissertation Proposal; Raytracing on CUDA</title>
      <link>http://popo894.friendlinkup.com/2008/09/06/my-masters-dissertationproposal-raytracing-on-cuda.html</link>
      <description>
UNIVERSITY of ABERTAY DUNDEE

School of Computing &amp; Creative Technologies

May 2008

By Umut Riza ERTURK

0703851



Table of Contents

1 Introduction
1.1 Introduction
1.1.1 High Concept
1.1.2 Project Overview
1.1.3 About this Proposal Document
2 Literature Review
2.1 Ray Tracing
2.1.1 KD-Trees
2.1.2 CUDA
3 Method Design
3.1 High-Level Outcomes
3.2 Programming the CPU
3.2.1 Constructing and updating the KD-tree data structure
3.3 Programming the CUDA
3.3.1 Ray operations
4 Evaluation
4.1.1 Evaluation Criteria
4.1.2 Qualitative
4.1.3 Quantitative
5 Project Plan
6 References




Introduction



Introduction



High Concept
The aim of this project is to develop a real-time ray tracing system on the Compute Unified Device Architecture (CUDA). The main idea is to create a generic and portable real-time ray tracer that handles realistic lighting factors such as soft shadows (penumbra), reflection and refraction at a speed acceptable to the human eye, 30 frames per second (fps), by using KD-trees on a fully parallel processing architecture: CUDA.





Project Overview
One of the most important driving forces of computer graphics technology is obviously computer games, which are getting more realistic all the time. As games develop in terms of visual realism they require more realistic visual attributes such as soft and realistic volume shadows, multi-pass lighting and complex shaders. To accomplish this level of realism with the currently most common rendering method, rasterization [03], they are not only getting more resource intensive but also more complicated to develop. The main problem of this classic rendering approach is that all of the objects in a 3D scene are composed of triangles, and all these triangles have to pass from the main processing unit to the graphics processing unit one by one. In the rasterization pipeline each of these triangles needs to be analyzed, coloured, lit, textured and culled before it finally becomes pixels (CDR-INF, 2007).  This approach not only causes a linear slowdown as the number of triangles in a game scene increases, but also makes realistic effects more complex to implement, since each effect needs to be implemented with shader passes.

Despite the unrealistic results and the complex implementation of effects, rasterization is considered the conventional rendering method for computer games because of its low computation cost. However, microprocessor and parallel-computing technologies have improved significantly over the last few decades, which has drawn attention to the alternative way of rendering: ray tracing, a completely different technique that achieves nearly perfect visual realism. The high visual quality of ray tracing is based on a physically correct simulation of real light behaviour. Basically, the ray tracing algorithm starts by sending at least one ray for each pixel from the camera into the 3D world space. If the ray intersects an object then, according to the material of the object, the algorithm creates the reflected and refracted rays from the object and continues recursively. Because of its physically correct implementation it does not struggle to create realistic effects such as global illumination or shadowing. However, this method is computationally expensive due to two major problems: creating the rays, and finding the intersection point of a ray and any object in the 3D space. For instance, a scene with 1K triangles rendered at 1024&#215;768 pixels needs at least 786,432 eye rays to be created and, depending on how the scene is organised, may require millions of triangle-ray intersection tests; moreover, these rays will interact with object surfaces.

With the help of significantly improving hardware and software technologies these difficulties are getting easier to tackle. For the object-ray intersection problem, one of the most efficient algorithms involves the KD-tree structure. A KD-tree is a kind of binary space partitioning (BSP) tree and one of the most preferred data structures for interactive scenes (Shevtsov Maxim, 2007). Briefly, a KD-tree is a multidimensional search tree for points in k-dimensional space (NIST, 2005) which makes it possible to find an object in the space with a complexity of O(log n). In addition to this algorithmic improvement for the object-ray intersection problem, another improvement, this time for creating rays, is parallel computing. One of the most recent parallel computing architectures is NVidia&#8217;s CUDA hardware solution, which scales to hundreds of cores and thousands of threads. Apart from the massively parallel advantage, its other advantage is its programming language, which is C.

Ray tracing is one of the most promising photorealistic rendering techniques for computer games, according to ongoing research and to the improvements in hardware technologies. On the other hand, CUDA is a very new and promising architecture with a structure suitable for ray tracing, and no satisfactory research has yet been done on this system. This project will investigate techniques and possible programming improvements for ray tracing on the massively multi-threaded architecture named CUDA.



About this Proposal Document
This proposal aims to investigate, develop and try to improve real-time ray tracing with a specific data structure, the KD-tree, on a specific multi-threaded programming architecture, CUDA.  A review of the KD-tree structure and the CUDA hardware can be found in the &#8216;Literature Review&#8217; section.

Following on from that research, a method for implementing the discussed techniques is presented in the &#8216;Method Design&#8217; chapter.  This general design is followed by a week-by-week plan detailing a schedule for implementing the proposed project, in the &#8216;Project Plan&#8217; section.




Literature Review
This chapter aims to summarise the techniques, data structures and hardware within the scope of this project.  A brief overview of each topic is given, followed by a discussion of the key areas in each field which are applicable to this project.



Ray Tracing
Ray tracing is a rendering method that creates photorealistic 2D views of a 3D space by simulating real light behaviour. It is one of the first solutions for rendering, which is why it is also one of the most researched rendering techniques to date; however, because of the complexity and amount of the calculations, it is mostly used for offline rendering. Despite this complexity, the idea is very simple. Before understanding ray tracing it is essential to understand real light behaviour, since ray tracing is just a simulation of it.

What are light and colour?


Light is an electromagnetic wave in a certain range of frequencies, and its frequency determines the colour of the light. Apart from the wave explanation of light there is the particle explanation: light is composed of small packets named photons which travel in a certain direction at the speed of light. When a photon collides with a surface, one of three scenarios happens according to the physical attributes of the surface. If the surface is reflective then the photon reflects with respect to the normal at the hit point. After the reflection, the photon changes its colour according to the colour of the surface. There are two kinds of reflection: &#8216;specular reflection&#8217;, which happens on regular surfaces, and &#8216;diffuse reflection&#8217;, which happens on irregular surfaces.






 Figure 1 – Specular Reflection of Light



Figure 2 - Diffuse Reflection of Light





The other scenario is that the photon passes through the object it collides with, which is referred to as refraction. After it enters the object, it changes its direction according to the density of the material and the normal at the hit point.








Figure 3 - Refraction of Light






The last scenario is that the photon gets absorbed by the surface and can neither reflect nor refract; this happens only on really dark surfaces.

Reflection and refraction can also happen at the same time; this case can be treated as more than one photon being created from one photon.








Figure 4 - Reflection and Refraction together





Basically, ray tracing is the simulation of these rules in a virtual 3D world, and its objective is to convert the 3D space into a 2D image by determining the colour of each cell (pixel) of the 2D image. The main difference between ray tracing and the real world is that in the real world all photons reach our eyes from light sources, either directly or after reflections and refractions, whereas the ray tracing algorithm does the reverse: it sends the light rays (photons) from each of the pixels in the 2D world (the screen) into the 3D world (e.g. a game scene). The reason for sending rays from the screen to the scene is that only the rays reaching the screen affect the final colour value of a pixel. That also means that a photon reaching the back face of the sphere (in Figure 5) and reflecting in an irrelevant direction won&#8217;t affect the final 2D image, so calculations for this photon are unnecessary.








Figure 5 - Basic ray tracing concept; rays from the screen to scene




Figure 6 - The dashed lines represent the wasted photons which will never reach the screen













Figure 7 - Adaptive supersampling: 5 rays are sent for each pixel at the beginning, and the number of rays is increased if the colour difference between these 5 rays is big





Ray tracing requires at least one ray to be sent from each pixel; however, in most cases one ray is not sufficient, since the projection of one pixel may correspond to a large area in the 3D world. On the other hand, sending a constant number of rays (e.g. 10 rays per pixel) from each pixel, which is referred to as supersampling, is not an efficient solution since it increases the ray tracing time linearly. Determining the number of rays to be sent from each pixel is one of the difficulties of ray tracing, and it plays an important role in image quality issues such as anti-aliasing. One of the efficient solutions to this performance-quality trade-off is adaptive supersampling. This method initially sends a small number of rays from a pixel. If the resulting colour values of these rays differ noticeably from each other, it increases the number of rays for the current pixel, sends new rays according to the difference between the rays&#8217; colours, and at the end mixes the colours of these rays to calculate the actual pixel colour (Glassner Andrew S., 1989).
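As a rough illustration of adaptive supersampling, the Python sketch below samples the four corners and the centre of a pixel and subdivides the pixel into quadrants only where the samples disagree. The shade callback (returning a grey intensity for a screen position), the threshold value and the depth limit are assumptions chosen for this example; they are not taken from Glassner or from the project code.

```python
def adaptive_sample(shade, x, y, size=1.0, threshold=0.1, depth=0, max_depth=3):
    """Average shade() over a square pixel area, refining where the samples differ."""
    half = size / 2.0
    # Five initial rays: the four corners and the centre of the area.
    points = [(x, y), (x + size, y), (x, y + size), (x + size, y + size),
              (x + half, y + half)]
    samples = [shade(px, py) for px, py in points]
    # Stop when the colours agree closely enough or the depth limit is reached.
    if depth == max_depth or threshold >= max(samples) - min(samples):
        return sum(samples) / len(samples)
    # Large colour difference: give each quadrant its own five rays.
    quadrants = [(x, y), (x + half, y), (x, y + half), (x + half, y + half)]
    refined = [adaptive_sample(shade, qx, qy, half, threshold, depth + 1, max_depth)
               for qx, qy in quadrants]
    return sum(refined) / 4.0
```

A pixel covering a flat region costs exactly five calls to shade, while a pixel crossed by an object edge spends more rays only in the quadrants that contain the edge.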

After creating the rays and sending them through the 3D scene, the object each ray collides with needs to be found. In addition to finding the object, the exact location of the hit in object space and the normal at this location need to be found. This is one of the most challenging difficulties of ray tracing, and numerous algorithms and data structures have been discussed and implemented to date. The two best-known methods for finding the hit point quickly are bounding volume hierarchies (BVH) and KD-trees. KD-trees are discussed in detail in the next section, but the BVH method is not discussed in this paper. Despite significant algorithmic improvements on this problem, ray tracers still spend 75% to 95% of their processing time on it (Foley James D., 1990).

The next step after creating the primary rays and finding the intersection points is to check the hit point&#8217;s position with respect to the light sources. If there is no object between the hit point and the light source, then the colour of the light ray depends directly on the light source and on some further calculations.








Figure 8 - The colour of ray-a depends directly on the light source; ray-b, however, is in the shadow region since its way to the light source is blocked by the object itself






After taking light sources and shadows into account, the next point is reflecting and/or refracting the ray at the intersection point. The key input coming from the game scene for finding the reflection and refraction rays is the normal at the point. A 3D scene is usually composed of geometric primitives such as spheres, cubes, planes, triangles etc. The normal calculation at a point depends on the primitive the ray intersects. To give an example, the ray-sphere intersection problem can easily be solved with these algebraic equations;




 











(Owen G. Scott, 1999)
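The equations cited from (Owen G. Scott, 1999) were an image and are not reproduced in this feed. As a hedged stand-in, the Python sketch below uses the standard quadratic form of the ray-sphere test: substitute the ray origin + t * direction into the sphere equation and solve for t. The function names and the tuple-based vectors are assumptions for illustration and may differ from the notation of the cited source.

```python
import math


def intersect_sphere(origin, direction, centre, radius):
    """Return the nearest positive distance t along the ray, or None on a miss."""
    oc = [o - c for o, c in zip(origin, centre)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if 0.0 > disc:
        return None  # negative discriminant: the ray misses the sphere
    root = math.sqrt(disc)
    # Try the nearer root first; skip hits behind (or at) the ray origin.
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t > 1e-9:
            return t
    return None


def sphere_normal(point, centre, radius):
    """Unit normal of the sphere at a point on its surface."""
    return [(p - c) / radius for p, c in zip(point, centre)]
```

For a ray starting at the origin and looking along +z at a unit sphere centred at (0, 0, 5), the hit point is origin + t * direction, and sphere_normal at that point faces back towards the ray.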






The normal allows the reflection and refraction rays to be found;


Solution for the reflection ray Rl is;


 


Solution for the refraction ray Rr is;








Figure 9 - (a) reflection, (b) refraction
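The two solutions were images and are missing from this feed, so the sketch below falls back on the textbook vector forms: the mirror reflection of the incident direction about the normal for Rl, and Snell&#8217;s law in vector form for Rr. It assumes unit-length vectors, an incident direction i pointing towards the surface and a normal n pointing against it; the names reflect, refract, n1 and n2 (the refractive indices on each side) are illustrative assumptions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def reflect(i, n):
    """Reflection ray Rl: mirror the incident direction i about the normal n."""
    k = 2.0 * dot(i, n)
    return [x - k * y for x, y in zip(i, n)]


def refract(i, n, n1, n2):
    """Refraction ray Rr by Snell's law; None on total internal reflection."""
    eta = n1 / n2
    cos_i = -dot(i, n)
    k = 1.0 - eta * eta * (1.0 - cos_i * cos_i)
    if 0.0 > k:
        return None  # no transmitted ray exists at this angle
    s = eta * cos_i - math.sqrt(k)
    return [eta * x + s * y for x, y in zip(i, n)]
```

A ray hitting the surface head-on is reflected straight back and refracted straight through, and going from a denser to a thinner medium at a steep enough angle yields None, which a tracer would treat as pure reflection.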






After finding the reflection and refraction rays the algorithm continues recursively, as shown in the pseudo code below;







(Rademacher Paul, 200  


Figure 10 - demonstration of a ray path with reflections and refractions
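The cited pseudo code did not survive in this feed. The Python sketch below is not that pseudo code, only a minimal illustration of the recursive control flow: the colour at a hit is the local colour plus the weighted colours of the reflected and refracted rays. To stay short it replaces the geometry with a hypothetical lookup table HITS that maps a ray name to its hit data; a real tracer would compute these values by intersecting the ray with the scene. Colours are single grey values and all numbers are invented.

```python
# Hypothetical hit table: ray name -> (local colour, reflectivity, transparency,
# reflected ray name, refracted ray name). A ray missing from the table hits nothing.
HITS = {
    "eye": (0.5, 0.4, 0.2, "r1", "t1"),
    "r1": (0.8, 0.0, 0.0, None, None),
    "t1": (0.2, 0.5, 0.0, "r2", None),
    "r2": (1.0, 0.0, 0.0, None, None),
}
BACKGROUND = 0.1


def trace(ray, depth=0, max_depth=4):
    """Return the grey value carried back along a ray, following secondary rays."""
    if ray not in HITS or depth == max_depth:
        return BACKGROUND
    local, kr, kt, reflected, refracted = HITS[ray]
    colour = local
    if kr:
        colour += kr * trace(reflected, depth + 1, max_depth)
    if kt:
        colour += kt * trace(refracted, depth + 1, max_depth)
    return colour
```

The max_depth argument is the usual guard that stops rays bouncing forever between reflective surfaces; lowering it trades image quality for speed.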






As a result, the image quality of this rendering technique is obviously much more realistic compared to the classic rasterization method, since it is a real implementation of light behaviour. However, the calculation cost of each step is still too expensive to put this algorithm into current games on current hardware. Fortunately, there are satisfactory improvements in both methods and hardware. In the next sections one suitable technique and a specific hardware architecture are introduced.


KD-Trees
A KD-tree is a special kind of binary tree data structure for organising points in k-dimensional space (since graphics applications run in three-dimensional space, &#8216;kd-tree&#8217; in this paper means &#8216;three-dimensional kd-tree&#8217;) which provides multidimensional search [05]. The construction cost of the KD-tree grows as O(n log n) with the number of primitives n in the scene, and the average cost of traversing a voxel can be estimated (Shevtsov Maxim, 2007). Another important point is that, unlike an octree, which divides the 3D space into a constant number of parts, a kd-tree divides the space into parts of unfixed size with planes perpendicular to one of the coordinate system axes, which also distinguishes KD-trees from conventional BSP trees.

The use of kd-trees in the ray tracing algorithm is to find, as quickly as possible, the objects in the 3D world space that intersect the ray sent from the camera through the scene.

Although there are different kinds of KD-trees depending on the optimisations, roughly speaking the data structure for a node contains two child pointers (just like a binary tree), a name and a key (usually a pair of floating point numbers) which represents the dimensions of the rectangle.

The creation of a KD-tree can briefly be described by the following steps;








At each step, choose one of the dimensions as the basis for dividing the rest of the points

For example, at the root choose the x-axis as the basis

Like binary search trees, all items to the left of the root will have an x-coordinate less than that of the root

All items to the right of the root will have an x-coordinate greater than (or equal to) that of the root

Choose y as the basis for discrimination for the root&#8217;s children

And choose x again for the root&#8217;s grandchildren






Figure 11 - Steps for constructing a kd-tree, taken from the lecture presentations of Sharat Chandran, 2002
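The steps above can be sketched in a few lines of Python. This is only an illustration for points, assuming the split point is the median along the current axis (one of several possible choices) and storing nodes as plain dictionaries; the name build_kdtree and the node layout are not from the proposal.

```python
def build_kdtree(points, depth=0):
    """Build a kd-tree as nested dicts, cycling the split axis with the depth."""
    if not points:
        return None
    axis = depth % len(points[0])  # x at the root, y for its children, then x again
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    # Move left past equal coordinates so the left side is strictly smaller
    # and the right side is greater than or equal to the root.
    while mid > 0 and ordered[mid - 1][axis] == ordered[mid][axis]:
        mid -= 1
    return {
        "point": ordered[mid],
        "axis": axis,
        "left": build_kdtree(ordered[:mid], depth + 1),
        "right": build_kdtree(ordered[mid + 1:], depth + 1),
    }


tree = build_kdtree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
```

For these six 2D points the root splits on x and its children split on y, matching the alternation described in the steps.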






The major problem in constructing a kd-tree is determining the division planes, since kd-trees are not divided at fixed positions.

As shown in figure 12, dividing a scene into two parts can be done in several ways. The problem is finding a more balanced division to increase the efficiency of the kd-tree and of the ray tracer. The optimisation is based on a simple rule: large areas with few objects or small areas with numerous objects work faster (Martin E., 2006).











Figure 12 - Different division methods (split at the middle, split at the median, cost optimized partitioning), taken from lecture slides of Martin Eisemann

The cost optimisation formula according to the rule mentioned above is;





 




Equation 1 - cost equation to split the space with higher efficiency, taken from lecture slides of Martin Eisemann
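The equation itself was an image and is missing here. A commonly used cost model that follows the same rule is the surface area heuristic, so the Python sketch below uses it as an assumption: the cost of a split is the traversal cost plus, for each side, the chance of a ray entering that side (its surface area divided by the parent&#8217;s) times the number of objects in it. The constants c_trav and c_isect and all names are illustrative and may differ from the formula in the cited slides.

```python
def surface_area(box):
    """Surface area of an axis-aligned box given as (min corner, max corner)."""
    (x0, y0, z0), (x1, y1, z1) = box
    dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def split_cost(box, axis, position, n_left, n_right, c_trav=1.0, c_isect=1.5):
    """Estimated cost of splitting box at position along axis (0=x, 1=y, 2=z)."""
    lo, hi = box
    left_hi = list(hi)
    left_hi[axis] = position
    right_lo = list(lo)
    right_lo[axis] = position
    area = surface_area(box)
    p_left = surface_area((lo, left_hi)) / area
    p_right = surface_area((right_lo, hi)) / area
    return c_trav + c_isect * (p_left * n_left + p_right * n_right)


unit = ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
dense_small = split_cost(unit, 0, 0.1, 9, 1)  # nine objects packed into the thin slab
dense_large = split_cost(unit, 0, 0.1, 1, 9)  # nine objects spread over the big part
```

Under this model dense_small comes out cheaper than dense_large, which is the rule stated above: a small area with many objects next to a large area with few is the better split.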

As a result, the predictable cost structure of KD-trees is a very big plus for multi-threaded ray tracing, since the work can be distributed equally between threads to maximise the efficiency of the multiprocessing. On the other hand, KD-trees are considered the best solution for static scenes; with the increasing interest in ray tracing algorithms, however, several studies have looked at using KD-trees for dynamic scenes, which is the challenging part.



CUDA
Over the last few years GPU devices have reached significant computational power (figure 13) compared to CPUs. The main reason is that, as graphics operations are highly intensive and require parallel computation, GPUs evolved in a different way from CPUs: they devote more transistors to data processing than to caching and flow control (NVidia, 2007).








Figure 13 - Floating-Point Operations per Second for the CPU and GPU (figures taken from (NVidia, 2007))



 


Figure 14 - Difference between current GPUs and CPUs; GPUs devote more transistors to processing (figures taken from (NVidia, 2007))





With the rapidly increasing computational power of GPUs, people have tried to run not only graphics applications but also numerous other applications on GPUs. CUDA was developed as a hardware-software architecture to respond to these needs by providing a massively parallel and simple computation structure.








Figure 15 - Architecture of CUDA





The design goals of CUDA are described by its creator NVidia;








Scale to 100&#8217;s of cores, 1000&#8217;s of parallel threads



Nearly auto-managed and simple parallel computing structure



Enabling heterogeneous systems (i.e. CPU+GPU)







Figure 16 - (NVidia, 2007)

In addition to its parallel computing power, the other strength of CUDA is its C language, which allows more understandable and cleaner programs. Another important difference is that, unlike other GPU programming languages such as Cg or GLSL, CUDA can write to arbitrary memory addresses, and it exposes a very fast shared memory area of 16KB.











Briefly, the main reason for choosing this new-generation hardware-software architecture is that it provides great parallelisation opportunities, which are very suitable for ray tracing with KD-trees. However, CUDA&#8217;s stackless architecture is a big problem for recursive algorithms. This problem is the challenging point of the ray tracing/CUDA combination. On the other hand, recent research has shown that it is not impossible; in addition, CUDA is a very new and promising system for future applications.




Method Design

High-Level Outcomes
The aims of this project are:



To implement a heterogeneous (using both the CPU and the GPU) real-time ray tracer on a specific system, CUDA, by using a specific data structure, the KD-tree.



To achieve a frame rate acceptable to the human eye (~20fps) and acceptable visual quality at a resolution of 800&#215;600 in a one-million-triangle scene by using a G80 GPU.




Programming the CPU

Constructing and updating the KD-tree data structure
According to the ray tracing algorithm there are four main processes that need to be done. The first process is the construction of the KD-tree according to the object locations in the scene. For a real-time ray tracer not only constructing the data structure but also keeping it up to date is very important. The first major problem to tackle is using the KD-tree for dynamic scenes, since its structure is more suitable for static scenes. Thankfully, papers have already been published which show that by using some additional data it is possible to port kd-trees to dynamic scenes (e.g. [05]). Apart from keeping the data structure updated, the 3D object data needs to be provided by the CPU to the CUDA threads.

The other three main processes (creating the rays, finding the collision point on the object, and calculating the new rays reflected or refracted from the hit point) will be done by the CUDA threads.



Programming the CUDA

Ray operations
While the CPU keeps the KD-tree data structure updated and sends it to the shared memory area of the CUDA threads, the threads will create the rays from the screen through the scene using adaptive supersampling, so each thread will be responsible for a different number of rays. This operation is not complex, but it is heavy because of the high number of rays. The first benefit of an extremely parallel architecture is that hundreds of these rays will be created at the same time. After creating the rays, each thread will search the KD-tree for the objects they collide with and find the normal at the hit point according to the type of the primitive, since sphere and triangle primitives will be implemented in this project. Before calculating the normals, the threads will find out whether there is a direct connection between the hit point and any of the light sources. Once it is known whether the hit point is in shadow or lit, the colour of the hit point will be recorded in the corresponding pixel data. The next step is finding the reflection and refraction rays. This step involves recursion, which is not supported by CUDA. This seems to be the biggest problem of the project; even an iterative approach is not a complete solution because of the small (16 KB) shared memory available to CUDA threads. There are three possible ways to overcome this: restricting the number of reflections and refractions (which will cause quality problems), using global memory (which will cause performance problems), or synchronising the threads&#8217; use of shared memory (which will increase the complexity of the implementation). In the end, the final colour value of the pixel will be determined from these reflected and refracted rays using a linear blending function.
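The first of the three options, capping the number of reflections and refractions, combines naturally with replacing recursion by an explicit stack. The sketch below shows the idea in Python rather than CUDA. The `intersect` callback, the tuple it returns and the `MAX_DEPTH` value are assumptions made for this illustration, not details taken from the proposal.

```python
MAX_DEPTH = 3  # assumed cap on secondary bounces; it also bounds the stack


def trace_iterative(primary_ray, intersect, max_depth=MAX_DEPTH):
    """Accumulate a pixel colour without recursion.

    intersect(ray) is a hypothetical callback that returns None on a miss,
    or a tuple (local_colour, k_reflect, reflected_ray, k_refract,
    refracted_ray) describing the hit point.
    """
    colour = [0.0, 0.0, 0.0]
    stack = [(primary_ray, 1.0, 0)]  # entries are (ray, blend weight, depth)
    while stack:
        ray, weight, depth = stack.pop()
        hit = intersect(ray)
        if hit is None:
            continue
        local, k_refl, refl_ray, k_refr, refr_ray = hit
        # Linear blend: the local shading keeps whatever weight is left over.
        k_local = 1.0 - k_refl - k_refr
        for i in range(3):
            colour[i] += weight * k_local * local[i]
        if depth + 1 > max_depth:
            continue  # depth cap reached: drop the secondary rays
        if k_refl > 0.0:
            stack.append((refl_ray, weight * k_refl, depth + 1))
        if k_refr > 0.0:
            stack.append((refr_ray, weight * k_refr, depth + 1))
    return tuple(colour)
```

Because each hit pops one entry and pushes at most two, the stack in this sketch never holds more than `max_depth + 1` entries, which is the property that would let it fit in a small fixed-size buffer. The cost is the one the text already names: contributions beyond the cap are simply lost.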



To sum up, implementing real-time ray tracing with a KD-tree data structure on the CUDA architecture has numerous difficulties to overcome. On the other hand, rapidly improving GPU technology opens new doors for tackling these difficulties, either by increasing data processing speed or by increasing the memory available to these processes (for instance, CUDA 2.0 is expected to support recursion). Because of these advances on the hardware side, interest in ray tracing is increasing day by day, and new methods and research are being published all the time. Hopefully, this project will be one of them.




Evaluation

Evaluation Criteria
There are two basic criteria to evaluate the result of the project:


Visual quality (qualitative)


Performance (quantitative)

Qualitative


According to the project aims, the most important qualitative requirement is the visual quality of the final image. If the visual quality of the final image is poor, speed is not meaningful, since high speed can easily be achieved by reducing the number of rays sent from the screen. However, image quality is subjective, so satisfying this criterion will depend on human judgement. To reduce subjectivity, one of the most common scenes used in ray tracing tests (the Cornell box) will be used to test the visual quality.








Figure 17 - The Cornell box will be used for the image quality criterion







Quantitative
Since the target fps, the hardware and the software are described clearly in the aims of this project, evaluating the quantitative criterion will be straightforward. Once visually satisfactory results are obtained from the Cornell box, 20 fps will be sufficient for the project to achieve its goal on the hardware and software mentioned.
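One simple way to check the quantitative criterion is to record how long each frame takes and compare the average frame rate with the 20 fps target. The helper below is a hypothetical sketch of that check. How the frame times are collected (for example with a timer around the render call) is left open.

```python
TARGET_FPS = 20.0  # target from the project aims


def average_fps(frame_times_s):
    """Mean frames per second over a run, from per-frame durations in seconds."""
    return len(frame_times_s) / sum(frame_times_s)


def meets_target(frame_times_s, target_fps=TARGET_FPS):
    """True if the average frame rate reaches the target."""
    return average_fps(frame_times_s) >= target_fps
```

Averaging over a whole run hides short stalls, so a stricter test could also look at the slowest frame.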



Project Plan




June 2008


1st – 31st
Learning the CUDA architecture, reviewing NVIDIA documents, and implementing some algorithms on CUDA to gain experience with it.
Revising parallel computation.


July 2008 


1st – 31st
Implementation of the KD-tree structure; searching for improvements to the KD-tree structure for dynamic scenes.


August 2008 


1st – 31st
Constructing the primitives&#8217; data structure and working on the calculation of reflected and refracted rays.


September 2008


1st w. - 2nd w.
Implementation of the reflection and refraction rays on CUDA


3rd w.
Incorporating the supersampling method on CUDA


4th w.
Progress presentation



October 2008


1st w.
Preparing documentation of the progress


2nd w. - 3rd w.
Integrating the CPU-side operations with CUDA-side operations. Getting first image results


4th w.
Submit 1st Draft


November 2008


1st w.
Maintenance of the program


2nd w. - 3rd w. – 4th w.
Preparing the final draft and reviewing final draft feedback.


December 2008


1st-31st
Incorporate any additional changes based on final draft feedback


January 2009


Submit Dissertation





References

Schmittler J., Pohl D., Dahmen T., Vogelgesang C., and Slusallek P., 2005. Realtime Ray Tracing for Current and Future Games, ACM Portal, [accessed 10th May 2008], url: http://portal.acm.org/citation.cfm?id=1198555.1198762

Schmittler J., Pohl D., Dahmen T., Vogelgesang C., and Slusallek P., 2004. Realtime Ray Tracing of Dynamic Scenes on an FPGA Chip, ACM Portal, [accessed 10th May 2008], url: http://portal.acm.org/ft_gateway.cfm?id=1058143&amp;type=pdf&amp;coll=GUIDE&amp;dl=GUIDE&amp;CFID=68260220&amp;CFTOKEN=96881082

Friedrich H., Gunther J., Dietrich A., Scherbaum M., Seidel Hans-Peter, Slusallek P., 2006. Exploring the Use of Ray Tracing for Future Games, ACM Portal, [accessed 10th May 2008], url: http://portal.acm.org/ft_gateway.cfm?id=1183323&amp;type=pdf&amp;coll=GUIDE&amp;dl=GUIDE&amp;CFID=68260623&amp;CFTOKEN=18219301

Woop Sven, Schmittler Jorg, Slusallek Philipp, 2005. RPU: a programmable ray processing unit for realtime ray tracing, ACM Portal, [accessed 10th May 2008], url: http://portal.acm.org/ft_gateway.cfm?id=1073211&amp;type=pdf&amp;coll=GUIDE&amp;dl=GUIDE&amp;CFID=68261058&amp;CFTOKEN=632516362

Zhou Kun, Hou Qiming, Wang Rui, Guo Baining, 2008. Real-Time KD-Tree Construction on Graphics Hardware, Microsoft Research, [accessed 10th May 2008], url: ftp://ftp.research.microsoft.com/pub/tr/TR-2008-52.pdf

Slusallek Philipp, Shirley Peter, Mark Bill, Stoll Gordon, Wald Ingo, 2005. Introduction to real time ray tracing – course 41, ACM Portal, http://portal.acm.org/ft_gateway.cfm?id=1183323&amp;type=pdf&amp;coll=GUIDE&amp;dl=GUIDE&amp;CFID=68262552&amp;CFTOKEN=78451416

Shevtsov Maxim, Soupikov Alexei, Kapustin Alexander, 2007. Highly Parallel Fast KD-tree Construction for Interactive Ray Tracing of Dynamic Scenes, Intel Corporation, [accessed 10th May 2008], url: http://www.google.co.uk/url?sa=t&amp;ct=res&amp;cd=1&amp;url=http%3A%2F%2Fkesen.huang.googlepages.com%2FIntel-EG07.pdf&amp;ei=GFUsSMazL5O-0QST1JSOBQ&amp;usg=AFQjCNHctNbBVxaKwpIZ7f71SQlBbvXobQ&amp;sig2=iFf0gqjzcfxGAIBflFwgOg

NVidia, 2007. NVIDIA CUDA Programming Guide, NVidia, [accessed 10th May 2008], url: http://www.google.co.uk/url?sa=t&amp;ct=res&amp;cd=1&amp;url=http%3A%2F%2Fdeveloper.download.nvidia.com%2Fcompute%2Fcuda%2F1_0%2FNVIDIA_CUDA_Programming_Guide_1.0.pdf&amp;ei=pVYsSJGFEKa8QLXogasF&amp;usg=AFQjCNHrbSl3bDFxVTvtHfjJ8RKpjlgxzg&amp;sig2=Df-Ib4yBF9BFNqi_dwWAeg

Rademacher Paul, Ray Tracing: Graphics for the Masses, The University of North Carolina, [accessed 10th May 2008], url: http://www.cs.unc.edu/~rademach/xroads-RT/RTarticle.html

Glassner Andrew S., 1989. An Introduction to ray tracing, Morgan Kaufmann

Foley James D., Dam Andries van, Feiner Steven K., Hughes John F., 1990. Computer Graphics: Principles and Practice, USA: Addison-Wesley

Owen G. Scott, 1999. Siggraph Education Materials (online), [accessed 10th May 2008], url: http://www.siggraph.org/education/materials/HyperGraph/raytrace/rtinter1.htm

Bikker Jacco, 2005. DevMaster, Raytracing: Theory &amp; Implementation Part 7, Kd-Trees and More Speed, [accessed 10th May 2008], url: http://www.devmaster.net/articles/raytracing_series/part7.php

Chandran Sharat, 2002. University of Maryland web site Data Structures lecture notes, Introduction to kd-trees, [accessed 10th May 2008], url: http://www.cs.umd.edu/class/spring2002/cmsc420-0401/pbasic.pdf

Martin Eisemann, 2006. University of Carolo-Wilhelmina, Computer Graphics - kD-Tree and Optimizations for Ray Tracing, [accessed 10th May 2008], url: http://graphics.tu-bs.de/teaching/lectures/ws0607/CG1/slides/07-kD-Tree.pdf

Marlon John, LaMothe Andre, 2003. Focus on Photon Mapping, Ohio: Premier Press.

CDR-INF, 2007. Real Time Ray-Tracing May Replace GPU Rasterization, [accessed 10th May 2008], url: http://www.cdrinfo.com/Sections/News/Details.aspx?NewsId=21608

NIST (National Institute of Standards and Technology), 2005. K-D tree data structure, [accessed 10th May 2008], url: http://www.nist.gov/dads/HTML/kdtree.html


</description>
      <guid>http://popo894.friendlinkup.com/2008/09/06/my-masters-dissertationproposal-raytracing-on-cuda.html</guid>
      <pubDate>Sat, 06 Sep 2008 19:24:32 -0400</pubDate>
      <dc:creator>popo894</dc:creator>
    </item>
</channel></rss>