[PATCH] D150612: AMDGPU: Expand casted f16 fmed3 pattern to fmin/fmax on gfx8

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 23 01:16:14 PDT 2023


foad added a comment.

This patch is causing the following test failure:

  FAIL: LLVM :: CodeGen/AMDGPU/GlobalISel/regbankcombiner-clamp-fmed3-const.mir (1 of 1)
  ******************** TEST 'LLVM :: CodeGen/AMDGPU/GlobalISel/regbankcombiner-clamp-fmed3-const.mir' FAILED ********************
  Script:
  --
  : 'RUN: at line 2';   /home/jayfoad2/llvm-release/bin/llc -mtriple=amdgcn-amd-mesa3d -mcpu=gfx1010 -run-pass=amdgpu-regbank-combiner -verify-machineinstrs /home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-clamp-fmed3-const.mir -o - | /home/jayfoad2/llvm-release/bin/FileCheck /home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-clamp-fmed3-const.mir
  --
  Exit Code: 1
  
  Command Output (stderr):
  --
  + : 'RUN: at line 2'
  + /home/jayfoad2/llvm-release/bin/llc -mtriple=amdgcn-amd-mesa3d -mcpu=gfx1010 -run-pass=amdgpu-regbank-combiner -verify-machineinstrs /home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-clamp-fmed3-const.mir -o -
  + /home/jayfoad2/llvm-release/bin/FileCheck /home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-clamp-fmed3-const.mir
  /home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-clamp-fmed3-const.mir:23:16: error: CHECK-NEXT: expected string not found in input
   ; CHECK-NEXT: [[AMDGPU_CLAMP:%[0-9]+]]:vgpr(s32) = nnan G_AMDGPU_CLAMP [[FMUL]]
                 ^
  <stdin>:151:30: note: scanning from here
   %3:vgpr(s32) = G_FMUL %0, %2
                               ^
  <stdin>:151:30: note: with "FMUL" equal to "%3"
   %3:vgpr(s32) = G_FMUL %0, %2
                               ^
  <stdin>:154:2: note: possible intended match here
   %6:vgpr(s32) = COPY %5(s32)
   ^
  /home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-clamp-fmed3-const.mir:58:16: error: CHECK-NEXT: expected string not found in input
   ; CHECK-NEXT: [[AMDGPU_CLAMP:%[0-9]+]]:vgpr(s16) = nnan G_AMDGPU_CLAMP [[FMUL]]
                 ^
  <stdin>:269:30: note: scanning from here
   %4:vgpr(s16) = G_FMUL %1, %3
                               ^
  <stdin>:269:30: note: with "FMUL" equal to "%4"
   %4:vgpr(s16) = G_FMUL %1, %3
                               ^
  <stdin>:272:2: note: possible intended match here
   %7:vgpr(s16) = COPY %6(s16)
   ^
  /home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-clamp-fmed3-const.mir:96:16: error: CHECK-NEXT: expected string not found in input
   ; CHECK-NEXT: [[AMDGPU_CLAMP:%[0-9]+]]:vgpr(s32) = G_AMDGPU_CLAMP [[FMINNUM_IEEE]]
                 ^
  <stdin>:387:38: note: scanning from here
   %4:vgpr(s32) = G_FMINNUM_IEEE %2, %3
                                       ^
  <stdin>:387:38: note: with "FMINNUM_IEEE" equal to "%4"
   %4:vgpr(s32) = G_FMINNUM_IEEE %2, %3
                                       ^
  <stdin>:390:2: note: possible intended match here
   %7:vgpr(s32) = COPY %6(s32)
   ^
  /home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-clamp-fmed3-const.mir:131:16: error: CHECK-NEXT: expected string not found in input
   ; CHECK-NEXT: [[AMDGPU_CLAMP:%[0-9]+]]:vgpr(s32) = G_AMDGPU_CLAMP [[FMUL]]
                 ^
  <stdin>:502:30: note: scanning from here
   %3:vgpr(s32) = G_FMUL %0, %2
                               ^
  <stdin>:502:30: note: with "FMUL" equal to "%3"
   %3:vgpr(s32) = G_FMUL %0, %2
                               ^
  <stdin>:505:2: note: possible intended match here
   %6:vgpr(s32) = COPY %5(s32)
   ^
  /home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-clamp-fmed3-const.mir:245:16: error: CHECK-NEXT: expected string not found in input
   ; CHECK-NEXT: [[AMDGPU_CLAMP:%[0-9]+]]:vgpr(s32) = G_AMDGPU_CLAMP [[FMUL]]
                 ^
  <stdin>:849:30: note: scanning from here
   %3:vgpr(s32) = G_FMUL %0, %2
                               ^
  <stdin>:849:30: note: with "FMUL" equal to "%3"
   %3:vgpr(s32) = G_FMUL %0, %2
                               ^
  <stdin>:852:2: note: possible intended match here
   %6:vgpr(s32) = COPY %5(s32)
   ^
  
  Input file: <stdin>
  Check file: /home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-clamp-fmed3-const.mir
  
  -dump-input=help explains the following input dump.
  
  Input was:
  <<<<<<
              .
              .
              .
            111:  waveLimiter: false 
            112:  hasSpilledSGPRs: false 
            113:  hasSpilledVGPRs: false 
            114:  scratchRSrcReg: '$private_rsrc_reg' 
            115:  frameOffsetReg: '$fp_reg' 
            116:  stackPtrOffsetReg: '$sp_reg' 
            117:  bytesInStackArgArea: 0 
            118:  returnsVoid: true 
            119:  argumentInfo: 
            120:  privateSegmentBuffer: { reg: '$sgpr0_sgpr1_sgpr2_sgpr3' } 
            121:  dispatchPtr: { reg: '$sgpr4_sgpr5' } 
            122:  queuePtr: { reg: '$sgpr6_sgpr7' } 
            123:  dispatchID: { reg: '$sgpr10_sgpr11' } 
            124:  workGroupIDX: { reg: '$sgpr12' } 
            125:  workGroupIDY: { reg: '$sgpr13' } 
            126:  workGroupIDZ: { reg: '$sgpr14' } 
            127:  LDSKernelId: { reg: '$sgpr15' } 
            128:  implicitArgPtr: { reg: '$sgpr8_sgpr9' } 
            129:  workItemIDX: { reg: '$vgpr31', mask: 1023 } 
            130:  workItemIDY: { reg: '$vgpr31', mask: 1047552 } 
            131:  workItemIDZ: { reg: '$vgpr31', mask: 1072693248 } 
            132:  psInputAddr: 0 
            133:  psInputEnable: 0 
            134:  mode: 
            135:  ieee: true 
            136:  dx10-clamp: true 
            137:  fp32-input-denormals: true 
            138:  fp32-output-denormals: true 
            139:  fp64-fp16-input-denormals: true 
            140:  fp64-fp16-output-denormals: true 
            141:  highBitsOf32BitAddress: 0 
            142:  occupancy: 16 
            143:  vgprForAGPRCopy: '' 
            144: body: | 
            145:  bb.0: 
            146:  liveins: $vgpr0 
            147:   
            148:  %0:vgpr(s32) = COPY $vgpr0 
            149:  %1:sgpr(s32) = G_FCONSTANT float 2.000000e+00 
            150:  %2:vgpr(s32) = COPY %1(s32) 
            151:  %3:vgpr(s32) = G_FMUL %0, %2 
  next:23'0                                   X error: no match found
  next:23'1                                     with "FMUL" equal to "%3"
            152:  %4:sgpr(s32) = G_FCONSTANT float 1.000000e+00 
  next:23'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            153:  %5:sgpr(s32) = G_FCONSTANT float 0.000000e+00 
  next:23'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            154:  %6:vgpr(s32) = COPY %5(s32) 
  next:23'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  next:23'2       ?                            possible intended match
            155:  %7:vgpr(s32) = COPY %4(s32) 
  next:23'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            156:  %8:vgpr(s32) = nnan G_INTRINSIC intrinsic(@llvm.amdgcn.fmed3), %3(s32), %6(s32), %7(s32) 
  next:23'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            157:  $vgpr0 = COPY %8(s32) 
  next:23'0      ~~~~~~~~~~~~~~~~~~~~~~~
            158:  
  next:23'0      ~
            159: ... 
  next:23'0      ~~~~
            160: --- 
  next:23'0      ~~~~
            161: name: test_fmed3_f16_known_nnan_ieee_false 
  next:23'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            162: alignment: 1 
            163: exposesReturnsTwice: false 
            164: legalized: true 
            165: regBankSelected: true 
            166: selected: false 
            167: failedISel: false 
            168: tracksRegLiveness: true 
            169: hasWinCFI: false 
            170: callsEHReturn: false 
            171: callsUnwindInit: false 
            172: hasEHCatchret: false 
            173: hasEHScopes: false 
            174: hasEHFunclets: false 
            175: isOutlined: false 
            176: debugInstrRef: false 
            177: failsVerification: false 
            178: tracksDebugUserValues: false 
            179: registers: 
            180:  - { id: 0, class: vgpr, preferred-register: '' } 
            181:  - { id: 1, class: vgpr, preferred-register: '' } 
            182:  - { id: 2, class: sgpr, preferred-register: '' } 
            183:  - { id: 3, class: vgpr, preferred-register: '' } 
            184:  - { id: 4, class: vgpr, preferred-register: '' } 
            185:  - { id: 5, class: sgpr, preferred-register: '' } 
            186:  - { id: 6, class: sgpr, preferred-register: '' } 
            187:  - { id: 7, class: vgpr, preferred-register: '' } 
            188:  - { id: 8, class: vgpr, preferred-register: '' } 
            189:  - { id: 9, class: vgpr, preferred-register: '' } 
            190:  - { id: 10, class: vgpr, preferred-register: '' } 
            191: liveins: [] 
            192: frameInfo: 
            193:  isFrameAddressTaken: false 
            194:  isReturnAddressTaken: false 
              .
              .
              .
            229:  hasSpilledSGPRs: false 
            230:  hasSpilledVGPRs: false 
            231:  scratchRSrcReg: '$private_rsrc_reg' 
            232:  frameOffsetReg: '$fp_reg' 
            233:  stackPtrOffsetReg: '$sp_reg' 
            234:  bytesInStackArgArea: 0 
            235:  returnsVoid: true 
            236:  argumentInfo: 
            237:  privateSegmentBuffer: { reg: '$sgpr0_sgpr1_sgpr2_sgpr3' } 
            238:  dispatchPtr: { reg: '$sgpr4_sgpr5' } 
            239:  queuePtr: { reg: '$sgpr6_sgpr7' } 
            240:  dispatchID: { reg: '$sgpr10_sgpr11' } 
            241:  workGroupIDX: { reg: '$sgpr12' } 
            242:  workGroupIDY: { reg: '$sgpr13' } 
            243:  workGroupIDZ: { reg: '$sgpr14' } 
            244:  LDSKernelId: { reg: '$sgpr15' } 
            245:  implicitArgPtr: { reg: '$sgpr8_sgpr9' } 
            246:  workItemIDX: { reg: '$vgpr31', mask: 1023 } 
            247:  workItemIDY: { reg: '$vgpr31', mask: 1047552 } 
            248:  workItemIDZ: { reg: '$vgpr31', mask: 1072693248 } 
            249:  psInputAddr: 0 
            250:  psInputEnable: 0 
            251:  mode: 
            252:  ieee: false 
            253:  dx10-clamp: true 
            254:  fp32-input-denormals: true 
            255:  fp32-output-denormals: true 
            256:  fp64-fp16-input-denormals: true 
            257:  fp64-fp16-output-denormals: true 
            258:  highBitsOf32BitAddress: 0 
            259:  occupancy: 16 
            260:  vgprForAGPRCopy: '' 
            261: body: | 
            262:  bb.0: 
            263:  liveins: $vgpr0 
            264:   
            265:  %0:vgpr(s32) = COPY $vgpr0 
            266:  %1:vgpr(s16) = G_TRUNC %0(s32) 
            267:  %2:sgpr(s16) = G_FCONSTANT half 0xH4000 
            268:  %3:vgpr(s16) = COPY %2(s16) 
            269:  %4:vgpr(s16) = G_FMUL %1, %3 
  next:58'0                                   X error: no match found
  next:58'1                                     with "FMUL" equal to "%4"
            270:  %5:sgpr(s16) = G_FCONSTANT half 0xH3C00 
  next:58'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            271:  %6:sgpr(s16) = G_FCONSTANT half 0xH0000 
  next:58'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            272:  %7:vgpr(s16) = COPY %6(s16) 
  next:58'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  next:58'2       ?                            possible intended match
            273:  %8:vgpr(s16) = COPY %5(s16) 
  next:58'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            274:  %9:vgpr(s16) = nnan G_INTRINSIC intrinsic(@llvm.amdgcn.fmed3), %4(s16), %7(s16), %8(s16) 
  next:58'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            275:  %10:vgpr(s32) = G_ANYEXT %9(s16) 
  next:58'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            276:  $vgpr0 = COPY %10(s32) 
  next:58'0      ~~~~~~~~~~~~~~~~~~~~~~~~
            277:  
  next:58'0      ~
            278: ... 
  next:58'0      ~~~~
            279: --- 
  next:58'0      ~~~~
            280: name: test_fmed3_non_SNaN_input_ieee_true_dx10clamp_true 
  next:58'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            281: alignment: 1 
            282: exposesReturnsTwice: false 
            283: legalized: true 
            284: regBankSelected: true 
            285: selected: false 
            286: failedISel: false 
            287: tracksRegLiveness: true 
            288: hasWinCFI: false 
            289: callsEHReturn: false 
            290: callsUnwindInit: false 
            291: hasEHCatchret: false 
            292: hasEHScopes: false 
            293: hasEHFunclets: false 
            294: isOutlined: false 
            295: debugInstrRef: false 
            296: failsVerification: false 
            297: tracksDebugUserValues: false 
            298: registers: 
            299:  - { id: 0, class: vgpr, preferred-register: '' } 
            300:  - { id: 1, class: sgpr, preferred-register: '' } 
            301:  - { id: 2, class: vgpr, preferred-register: '' } 
            302:  - { id: 3, class: vgpr, preferred-register: '' } 
            303:  - { id: 4, class: vgpr, preferred-register: '' } 
            304:  - { id: 5, class: sgpr, preferred-register: '' } 
            305:  - { id: 6, class: sgpr, preferred-register: '' } 
            306:  - { id: 7, class: vgpr, preferred-register: '' } 
            307:  - { id: 8, class: vgpr, preferred-register: '' } 
            308:  - { id: 9, class: vgpr, preferred-register: '' } 
            309: liveins: [] 
            310: frameInfo: 
            311:  isFrameAddressTaken: false 
            312:  isReturnAddressTaken: false 
              .
              .
              .
            347:  hasSpilledSGPRs: false 
            348:  hasSpilledVGPRs: false 
            349:  scratchRSrcReg: '$private_rsrc_reg' 
            350:  frameOffsetReg: '$fp_reg' 
            351:  stackPtrOffsetReg: '$sp_reg' 
            352:  bytesInStackArgArea: 0 
            353:  returnsVoid: true 
            354:  argumentInfo: 
            355:  privateSegmentBuffer: { reg: '$sgpr0_sgpr1_sgpr2_sgpr3' } 
            356:  dispatchPtr: { reg: '$sgpr4_sgpr5' } 
            357:  queuePtr: { reg: '$sgpr6_sgpr7' } 
            358:  dispatchID: { reg: '$sgpr10_sgpr11' } 
            359:  workGroupIDX: { reg: '$sgpr12' } 
            360:  workGroupIDY: { reg: '$sgpr13' } 
            361:  workGroupIDZ: { reg: '$sgpr14' } 
            362:  LDSKernelId: { reg: '$sgpr15' } 
            363:  implicitArgPtr: { reg: '$sgpr8_sgpr9' } 
            364:  workItemIDX: { reg: '$vgpr31', mask: 1023 } 
            365:  workItemIDY: { reg: '$vgpr31', mask: 1047552 } 
            366:  workItemIDZ: { reg: '$vgpr31', mask: 1072693248 } 
            367:  psInputAddr: 0 
            368:  psInputEnable: 0 
            369:  mode: 
            370:  ieee: true 
            371:  dx10-clamp: true 
            372:  fp32-input-denormals: true 
            373:  fp32-output-denormals: true 
            374:  fp64-fp16-input-denormals: true 
            375:  fp64-fp16-output-denormals: true 
            376:  highBitsOf32BitAddress: 0 
            377:  occupancy: 16 
            378:  vgprForAGPRCopy: '' 
            379: body: | 
            380:  bb.0: 
            381:  liveins: $vgpr0 
            382:   
            383:  %0:vgpr(s32) = COPY $vgpr0 
            384:  %1:sgpr(s32) = G_FCONSTANT float 1.000000e+01 
            385:  %2:vgpr(s32) = G_FCANONICALIZE %0 
            386:  %3:vgpr(s32) = COPY %1(s32) 
            387:  %4:vgpr(s32) = G_FMINNUM_IEEE %2, %3 
  next:96'0                                           X error: no match found
  next:96'1                                             with "FMINNUM_IEEE" equal to "%4"
            388:  %5:sgpr(s32) = G_FCONSTANT float 1.000000e+00 
  next:96'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            389:  %6:sgpr(s32) = G_FCONSTANT float 0.000000e+00 
  next:96'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            390:  %7:vgpr(s32) = COPY %6(s32) 
  next:96'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  next:96'2       ?                            possible intended match
            391:  %8:vgpr(s32) = COPY %5(s32) 
  next:96'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            392:  %9:vgpr(s32) = G_INTRINSIC intrinsic(@llvm.amdgcn.fmed3), %4(s32), %7(s32), %8(s32) 
  next:96'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            393:  $vgpr0 = COPY %9(s32) 
  next:96'0      ~~~~~~~~~~~~~~~~~~~~~~~
            394:  
  next:96'0      ~
            395: ... 
  next:96'0      ~~~~
            396: --- 
  next:96'0      ~~~~
            397: name: test_fmed3_maybe_SNaN_input_zero_third_operand_ieee_true_dx10clamp_true 
  next:96'0      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            398: alignment: 1 
            399: exposesReturnsTwice: false 
            400: legalized: true 
            401: regBankSelected: true 
            402: selected: false 
            403: failedISel: false 
            404: tracksRegLiveness: true 
            405: hasWinCFI: false 
            406: callsEHReturn: false 
            407: callsUnwindInit: false 
            408: hasEHCatchret: false 
            409: hasEHScopes: false 
            410: hasEHFunclets: false 
            411: isOutlined: false 
            412: debugInstrRef: false 
            413: failsVerification: false 
            414: tracksDebugUserValues: false 
            415: registers: 
            416:  - { id: 0, class: vgpr, preferred-register: '' } 
            417:  - { id: 1, class: sgpr, preferred-register: '' } 
            418:  - { id: 2, class: vgpr, preferred-register: '' } 
            419:  - { id: 3, class: vgpr, preferred-register: '' } 
            420:  - { id: 4, class: sgpr, preferred-register: '' } 
            421:  - { id: 5, class: sgpr, preferred-register: '' } 
            422:  - { id: 6, class: vgpr, preferred-register: '' } 
            423:  - { id: 7, class: vgpr, preferred-register: '' } 
            424:  - { id: 8, class: vgpr, preferred-register: '' } 
            425: liveins: [] 
            426: frameInfo: 
            427:  isFrameAddressTaken: false 
            428:  isReturnAddressTaken: false 
            429:  hasStackMap: false 
            430:  hasPatchPoint: false 
              .
              .
              .
            462:  waveLimiter: false 
            463:  hasSpilledSGPRs: false 
            464:  hasSpilledVGPRs: false 
            465:  scratchRSrcReg: '$private_rsrc_reg' 
            466:  frameOffsetReg: '$fp_reg' 
            467:  stackPtrOffsetReg: '$sp_reg' 
            468:  bytesInStackArgArea: 0 
            469:  returnsVoid: true 
            470:  argumentInfo: 
            471:  privateSegmentBuffer: { reg: '$sgpr0_sgpr1_sgpr2_sgpr3' } 
            472:  dispatchPtr: { reg: '$sgpr4_sgpr5' } 
            473:  queuePtr: { reg: '$sgpr6_sgpr7' } 
            474:  dispatchID: { reg: '$sgpr10_sgpr11' } 
            475:  workGroupIDX: { reg: '$sgpr12' } 
            476:  workGroupIDY: { reg: '$sgpr13' } 
            477:  workGroupIDZ: { reg: '$sgpr14' } 
            478:  LDSKernelId: { reg: '$sgpr15' } 
            479:  implicitArgPtr: { reg: '$sgpr8_sgpr9' } 
            480:  workItemIDX: { reg: '$vgpr31', mask: 1023 } 
            481:  workItemIDY: { reg: '$vgpr31', mask: 1047552 } 
            482:  workItemIDZ: { reg: '$vgpr31', mask: 1072693248 } 
            483:  psInputAddr: 0 
            484:  psInputEnable: 0 
            485:  mode: 
            486:  ieee: true 
            487:  dx10-clamp: true 
            488:  fp32-input-denormals: true 
            489:  fp32-output-denormals: true 
            490:  fp64-fp16-input-denormals: true 
            491:  fp64-fp16-output-denormals: true 
            492:  highBitsOf32BitAddress: 0 
            493:  occupancy: 16 
            494:  vgprForAGPRCopy: '' 
            495: body: | 
            496:  bb.0: 
            497:  liveins: $vgpr0 
            498:   
            499:  %0:vgpr(s32) = COPY $vgpr0 
            500:  %1:sgpr(s32) = G_FCONSTANT float 2.000000e+00 
            501:  %2:vgpr(s32) = COPY %1(s32) 
            502:  %3:vgpr(s32) = G_FMUL %0, %2 
  next:131'0                                  X error: no match found
  next:131'1                                    with "FMUL" equal to "%3"
            503:  %4:sgpr(s32) = G_FCONSTANT float 0.000000e+00 
  next:131'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            504:  %5:sgpr(s32) = G_FCONSTANT float 1.000000e+00 
  next:131'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            505:  %6:vgpr(s32) = COPY %5(s32) 
  next:131'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  next:131'2      ?                            possible intended match
            506:  %7:vgpr(s32) = COPY %4(s32) 
  next:131'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            507:  %8:vgpr(s32) = G_INTRINSIC intrinsic(@llvm.amdgcn.fmed3), %3(s32), %6(s32), %7(s32) 
  next:131'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            508:  $vgpr0 = COPY %8(s32) 
  next:131'0     ~~~~~~~~~~~~~~~~~~~~~~~
            509:  
  next:131'0     ~
            510: ... 
  next:131'0     ~~~~
            511: --- 
  next:131'0     ~~~~
            512: name: test_fmed3_f32_maybe_NaN_ieee_false 
  next:131'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            513: alignment: 1 
            514: exposesReturnsTwice: false 
            515: legalized: true 
            516: regBankSelected: true 
            517: selected: false 
            518: failedISel: false 
            519: tracksRegLiveness: true 
            520: hasWinCFI: false 
            521: callsEHReturn: false 
            522: callsUnwindInit: false 
            523: hasEHCatchret: false 
            524: hasEHScopes: false 
            525: hasEHFunclets: false 
            526: isOutlined: false 
            527: debugInstrRef: false 
            528: failsVerification: false 
            529: tracksDebugUserValues: false 
            530: registers: 
            531:  - { id: 0, class: vgpr, preferred-register: '' } 
            532:  - { id: 1, class: sgpr, preferred-register: '' } 
            533:  - { id: 2, class: vgpr, preferred-register: '' } 
            534:  - { id: 3, class: vgpr, preferred-register: '' } 
            535:  - { id: 4, class: sgpr, preferred-register: '' } 
            536:  - { id: 5, class: sgpr, preferred-register: '' } 
            537:  - { id: 6, class: vgpr, preferred-register: '' } 
            538:  - { id: 7, class: vgpr, preferred-register: '' } 
            539:  - { id: 8, class: vgpr, preferred-register: '' } 
            540: liveins: [] 
            541: frameInfo: 
            542:  isFrameAddressTaken: false 
            543:  isReturnAddressTaken: false 
            544:  hasStackMap: false 
            545:  hasPatchPoint: false 
              .
              .
              .
            809:  waveLimiter: false 
            810:  hasSpilledSGPRs: false 
            811:  hasSpilledVGPRs: false 
            812:  scratchRSrcReg: '$private_rsrc_reg' 
            813:  frameOffsetReg: '$fp_reg' 
            814:  stackPtrOffsetReg: '$sp_reg' 
            815:  bytesInStackArgArea: 0 
            816:  returnsVoid: true 
            817:  argumentInfo: 
            818:  privateSegmentBuffer: { reg: '$sgpr0_sgpr1_sgpr2_sgpr3' } 
            819:  dispatchPtr: { reg: '$sgpr4_sgpr5' } 
            820:  queuePtr: { reg: '$sgpr6_sgpr7' } 
            821:  dispatchID: { reg: '$sgpr10_sgpr11' } 
            822:  workGroupIDX: { reg: '$sgpr12' } 
            823:  workGroupIDY: { reg: '$sgpr13' } 
            824:  workGroupIDZ: { reg: '$sgpr14' } 
            825:  LDSKernelId: { reg: '$sgpr15' } 
            826:  implicitArgPtr: { reg: '$sgpr8_sgpr9' } 
            827:  workItemIDX: { reg: '$vgpr31', mask: 1023 } 
            828:  workItemIDY: { reg: '$vgpr31', mask: 1047552 } 
            829:  workItemIDZ: { reg: '$vgpr31', mask: 1072693248 } 
            830:  psInputAddr: 0 
            831:  psInputEnable: 0 
            832:  mode: 
            833:  ieee: true 
            834:  dx10-clamp: true 
            835:  fp32-input-denormals: true 
            836:  fp32-output-denormals: true 
            837:  fp64-fp16-input-denormals: true 
            838:  fp64-fp16-output-denormals: true 
            839:  highBitsOf32BitAddress: 0 
            840:  occupancy: 16 
            841:  vgprForAGPRCopy: '' 
            842: body: | 
            843:  bb.0: 
            844:  liveins: $vgpr0 
            845:   
            846:  %0:vgpr(s32) = COPY $vgpr0 
            847:  %1:sgpr(s32) = G_FCONSTANT float 2.000000e+00 
            848:  %2:vgpr(s32) = COPY %1(s32) 
            849:  %3:vgpr(s32) = G_FMUL %0, %2 
  next:245'0                                  X error: no match found
  next:245'1                                    with "FMUL" equal to "%3"
            850:  %4:sgpr(s32) = G_FCONSTANT float 1.000000e+00 
  next:245'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            851:  %5:sgpr(s32) = G_FCONSTANT float 0.000000e+00 
  next:245'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            852:  %6:vgpr(s32) = COPY %5(s32) 
  next:245'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  next:245'2      ?                            possible intended match
            853:  %7:vgpr(s32) = COPY %4(s32) 
  next:245'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            854:  %8:vgpr(s32) = G_INTRINSIC intrinsic(@llvm.amdgcn.fmed3), %3(s32), %6(s32), %7(s32) 
  next:245'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            855:  $vgpr0 = COPY %8(s32) 
  next:245'0     ~~~~~~~~~~~~~~~~~~~~~~~
            856:  
  next:245'0     ~
            857: ... 
  next:245'0     ~~~~
  >>>>>>
  
  --
  
  ********************
  ********************
  Failed Tests (1):
    LLVM :: CodeGen/AMDGPU/GlobalISel/regbankcombiner-clamp-fmed3-const.mir

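For context, the failing CHECK-NEXT lines in the log expect the `llvm.amdgcn.fmed3` intrinsic with constant operands 0.0 and 1.0 (in either order) to be combined into `G_AMDGPU_CLAMP`, whereas the actual output still contains the `G_INTRINSIC`. The following is only a minimal Python sketch of why a median-of-three with those constants equals a clamp to [0, 1] for ordinary inputs. The helper names `fmed3` and `clamp01` are hypothetical, and the sketch deliberately ignores NaN inputs and the `ieee` / `dx10-clamp` mode bits that the real combiner has to take into account.

```python
def fmed3(a, b, c):
    """Median of three floats (NaN handling deliberately ignored)."""
    return sorted((a, b, c))[1]


def clamp01(x):
    """Clamp x to the closed interval [0.0, 1.0]."""
    return min(max(x, 0.0), 1.0)


# The constant operands may appear in either order, as in the failing tests.
for x in (-2.5, 0.0, 0.25, 1.0, 3.0):
    assert fmed3(x, 0.0, 1.0) == clamp01(x)
    assert fmed3(x, 1.0, 0.0) == clamp01(x)
```

This only illustrates the arithmetic identity behind the expected output; it says nothing about why the combine stopped firing after this patch.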

https://reviews.llvm.org/D150612


